ipython_memory_usage

IPython tool to report memory usage deltas for every command you type. If you are running out of RAM then use this tool to understand what's happening. It also records the time spent running each command.

This tool helps you to figure out which commands use a lot of RAM and take a long time to run; this is very useful if you're working with large numpy matrices. In addition it reports the peak memory usage whilst a command is running, which might be higher (due to temporary objects) than the final RAM usage. Built on @fabianp's memory_profiler.

As a simple example, make 10,000,000 random numbers and see that this costs 76 MiB of RAM and takes 0.3 seconds to execute:

In [3]: arr=np.random.uniform(size=int(1e7))
'arr=np.random.uniform(size=int(1e7))' used 76.2578 MiB RAM in 0.33s, peaked 0.00 MiB above current, total RAM usage 107.37 MiB

Francesc Alted has a fork with more memory-delta details; see it here: https://github.com/FrancescAlted/ipython_memwatcher

For a demo using numpy and Pandas take a look at examples/example_usage_np_pd.ipynb.

Setup

Supported: Python 3.8+ and IPython 7.9+

Simple:

$ pip install ipython_memory_usage

via https://pypi.org/project/ipython-memory-usage/

$ conda install -c conda-forge ipython_memory_usage

via https://anaconda.org/conda-forge/ipython_memory_usage

OR

Take a copy of the code or fork from https://github.com/ianozsvald/ipython_memory_usage and then:

$ python setup.py install

If you pull it from GitHub and want to develop it, it is easier to make a link in site-packages and work on it locally with:

$ python setup.py develop 

To uninstall:

$ pip uninstall ipython_memory_usage

Example usage

We can measure, line by line, how much memory large array operations allocate and deallocate:

Using the magic:

$ ipython
In [1]: import ipython_memory_usage
# note that help(ipython_memory_usage) will give you some clues
In [2]: %ipython_memory_usage_start
Out[2]: 'memory profile enabled'
In [2] used 0.2383 MiB RAM in 0.11s, peaked 0.00 MiB above current, total RAM usage 47.64 MiB

In [3]: import numpy as np
   ...: a = np.ones(int(1e7))
In [3] used 85.9180 MiB RAM in 0.22s, peaked 0.00 MiB above current, total RAM usage 133.56 MiB

In [4]: %ipython_memory_usage_stop
Out[4]: 'memory profile disabled'

In [5]: a = np.ones(int(1e7))

Using a function call:

$ ipython
Python 3.4.3 |Anaconda 2.3.0 (64-bit)| (default, Jun  4 2015, 15:29:08) 
IPython 3.2.0 -- An enhanced Interactive Python.

In [1]: import ipython_memory_usage.ipython_memory_usage as imu
In [2]: import numpy as np

In [3]: imu.start_watching_memory()
In [3] used 0.0469 MiB RAM in 7.32s, peaked 0.00 MiB above current, total RAM usage 56.88 MiB

In [4]: a = np.ones(int(1e7))
In [4] used 76.3750 MiB RAM in 0.14s, peaked 0.00 MiB above current, total RAM usage 133.25 MiB

In [5]: del a
In [5] used -76.2031 MiB RAM in 0.10s, total RAM usage 57.05 MiB

In [6]: imu.stop_watching_memory()

In [7]: b = np.ones(int(1e7))

In [8]: b[0] * 5.0
Out[8]: 5.0

For a numpy beginner it is easy to end up working on copies of matrices, which can use a large amount of RAM. The following example sets the scene and then shows an in-place, low-RAM variant.

First we make a random square array and modify it twice using copies, taking 2.3 GB of RAM:

In [1]: imu.start_watching_memory()
In [2]: a = np.random.random((int(1e4),int(1e4)))
In [2] used 762.9531 MiB RAM in 2.21s, peaked 0.00 MiB above current, total RAM usage 812.30 MiB

In [3]: b = a*2
In [3] used 762.9492 MiB RAM in 0.51s, peaked 0.00 MiB above current, total RAM usage 1575.25 MiB

In [4]: c = np.sqrt(b)
In [4] used 762.9609 MiB RAM in 0.91s, peaked 0.00 MiB above current, total RAM usage 2338.21 MiB

Now we do the same operations but in-place on a, using 813MB RAM in total:

In [2]: a = np.random.random((int(1e4),int(1e4)))
In [2] used 762.9531 MiB RAM in 2.21s, peaked 0.00 MiB above current, total RAM usage 812.30 MiB
In [3]: a *= 2
In [3] used 0.0078 MiB RAM in 0.21s, peaked 0.00 MiB above current, total RAM usage 812.30 MiB
In [4]: a = np.sqrt(a, out=a)
In [4] used 0.0859 MiB RAM in 0.71s, peaked 0.00 MiB above current, total RAM usage 813.46 MiB

Many numpy functions have in-place variants that can write their result back into the source array (see the out argument): http://docs.scipy.org/doc/numpy/reference/ufuncs.html#available-ufuncs
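
As a minimal sketch of the out argument (plain numpy, separate from the README's transcripts):

import numpy as np

a = np.random.random(int(1e7))   # ~76 MiB of float64
np.sqrt(a, out=a)                # result written into a's own buffer, no temporary array
np.multiply(a, 2, out=a)         # in-place doubling via the same out mechanism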

If we make a large 1.5 GB array of random integers, we can take the square root in-place using two approaches, or assign the result to a new object b, which doubles the RAM usage:

In [2]: a = np.random.randint(low=0, high=5, size=(10000, 20000))
In [2] used 1525.8984 MiB RAM in 6.51s, peaked 0.00 MiB above current, total RAM usage 1575.26 MiB

In [3]: a = np.sqrt(a)
In [3] used 0.097 MiB RAM in 1.53s, peaked 1442.92 MiB above current, total RAM usage 1576.21 MiB

In [4]: a = np.sqrt(a, out=a)
In [4] used 0.0234 MiB RAM in 0.51s, peaked 0.00 MiB above current, total RAM usage 1575.44 MiB

In [5]: b = np.sqrt(a)
In [5] used 1525.8828 MiB RAM in 1.27s, peaked 0.00 MiB above current, total RAM usage 3101.32 MiB

Newer versions of numpy (1.13+) can reuse temporary objects in expressions, which reduces memory usage; see https://docs.scipy.org/doc/numpy-1.13.0/release.html

We see this behaviour in the output below. Prior to version 1.13 we would see a peak memory greater than 0.00 MiB above current. Older versions of numpy, and Windows builds, will show different memory usage due to temporary matrices.

In [2]: a = np.ones(int(1e8)); b = np.ones(int(1e8)); c = np.ones(int(1e8))
In [2] used 2288.8750 MiB RAM in 1.02s, peaked 0.00 MiB above current, total RAM usage 2338.06 MiB

In [3]: d = a * b + c
In [3] used 762.9453 MiB RAM in 0.71s, peaked 0.00 MiB above current, total RAM usage 3101.01 MiB

Knowing that a temporary is created, we can do an in-place operation instead for the same result but a lower overall RAM footprint:

In [2]: a = np.ones(int(1e8)); b = np.ones(int(1e8)); c = np.ones(int(1e8))
In [2] used 2288.8750 MiB RAM in 1.02s, peaked 0.00 MiB above current, total RAM usage 2338.06 MiB

In [3]: d = a * b
In [3] used 762.9453 MiB RAM in 0.49s, peaked 0.00 MiB above current, total RAM usage 3101.00 MiB

In [4]: d += c
In [4] used 0.0000 MiB RAM in 0.25s, peaked 0.00 MiB above current, total RAM usage 3101.00 MiB

For more on this example see the Tip at http://docs.scipy.org/doc/numpy/reference/ufuncs.html#available-ufuncs.

Important RAM usage note

It is much easier to debug RAM situations with a fresh IPython shell. The longer you use your current shell, the more objects remain inside it and the more RAM the operating system may have reserved. RAM is returned to the OS slowly, so you can end up with a large process that has plenty of spare internal RAM (which will be allocated to your new large objects), in which case this tool (via memory_profiler) reports 0 MB RAM usage. If you get confused or don't trust the results, quit IPython, start a fresh shell, and run the fewest commands you need to understand how RAM is added to the process.
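
To sanity-check what the underlying memory_profiler sees for the current process, a minimal sketch using its memory_usage function:

from memory_profiler import memory_usage

# sample this process's RAM (RSS, in MiB) every 0.1s for one second
samples = memory_usage(proc=-1, interval=0.1, timeout=1)
print("current process RAM: %.1f MiB" % samples[-1])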

Experimental perf stat report to monitor caching

I've added experimental support for the perf stat tool on Linux. To use it, first make sure that perf stat runs at the command line. Experimental support for the cache-misses event is enabled in this variant script (to use it, cd src/ipython_memory_usage first):

Python 3.4.3 |Anaconda 2.3.0 (64-bit)| (default, Jun  4 2015, 15:29:08) 
IPython 3.2.0 -- An enhanced Interactive Python.
In [1]: %run -i ipython_memory_usage_perf.py
In [2]: start_watching_memory()

Here's an example that builds on the previous ones. We build a square matrix with C ordering, we also need a 1D vector of the same size:

In [3]: ones_c = np.ones((int(1e4),int(1e4)))
In [4]: v = np.ones(int(1e4))

Next we run %timeit using all the data in row 0. The data will comfortably fit into a cache, as v.nbytes == 80000 (80 kilobytes) and my L3 cache is 6 MB. The reported perf value for cache-misses averages 8,823/second, i.e. roughly 8k cache misses per second during this operation (followed by all the raw sampled events for reference). %timeit shows that this operation costs 14 microseconds per loop:

In [5]: %timeit v * ones_c[0, :]
run_capture_perf running: perf stat --pid 4978 --event cache-misses -I 100
100000 loops, best of 3: 14.9 µs per loop
In [6] used 0.1875 MiB RAM in 6.27s, peaked 0.00 MiB above current, total RAM usage 812.54 MiB
perf value for cache-misses averages to 8,823/second, raw samples: [6273.0, 382.0, 441.0, 1103.0, 632.0, 1314.0, 180.0, 451.0, 189.0, 540.0, 159.0, 1632.0, 285.0, 949.0, 408.0, 79.0, 448.0, 1167.0, 505.0, 350.0, 79.0, 172.0, 683.0, 2185.0, 1151.0, 170.0, 716.0, 2224.0, 572.0, 1708.0, 314.0, 572.0, 21.0, 209.0, 498.0, 839.0, 955.0, 233.0, 202.0, 797.0, 88.0, 185.0, 1663.0, 450.0, 352.0, 739.0, 4413.0, 1810.0, 1852.0, 550.0, 135.0, 389.0, 334.0, 235.0, 1922.0, 658.0, 233.0, 266.0, 170.0, 2198.0, 222.0, 4702.0]

We can run the same code using alternative indexing: taking column 0 means fetching one element from each row, but the data is stored in row order, so each long row is pulled into the cache just to use a single element. Now %timeit reports 210 microseconds per loop, an order of magnitude slower than before, and on average we have 474k cache misses per second. This column-ordered method of indexing the data is far less cache-friendly than the previous (row-ordered) method.

In [5]: %timeit v * ones_c[:, 0]
run_capture_perf running: perf stat --pid 4978 --event cache-misses -I 100
1000 loops, best of 3: 210 µs per loop
In [5] used 0.0156 MiB RAM in 1.01s, peaked 0.00 MiB above current, total RAM usage 812.55 MiB
perf value for cache-misses averages to 474,771/second, raw samples: [77253.0, 49168.0, 48660.0, 53147.0, 52532.0, 56546.0, 50128.0, 48890.0, 43623.0]

If the sample-gathering happens too quickly then an artificial pause is added; this means IPython can pause for a fraction of a second, which inevitably causes some cache misses (as the CPU is being used and IPython is running an event loop). You can witness the baseline cache misses using pass:

In [9]: pass
run_capture_perf running: perf stat --pid 4978 --event cache-misses -I 100
PAUSING to get perf sample for 0.3s
In [9] used 0.0039 MiB RAM in 0.13s, peaked 0.00 MiB above current, total RAM usage 812.57 MiB
perf value for cache-misses averages to 131,611/second, raw samples: [14111.0, 3481.0]

NOTE that this is experimental; it is only known to work on Ian's laptop running Ubuntu Linux (perf doesn't exist on Mac or Windows). There are some tests for the perf parsing code; run nosetests perf_process.py to confirm these pass and validate against your own perf output. I'm using perf version 3.11.0-12. Inside perf_process.py the EVENT_TYPE can be substituted with other events like stalled-cycles-frontend (exit IPython and restart to make sure the run-time is clean - this code is hacky!).

To trial the code run $ python perf_process.py; this is useful for interactive development.
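
Before trying the IPython integration it is worth confirming that perf can read the cache-misses event on your machine, e.g. with a standard invocation like:

$ perf stat --event cache-misses sleep 1

If perf prints <not supported> for the event then your CPU or kernel configuration doesn't expose it.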

Requirements

Tested on

  • IPython 7.9 with Python 3.7 on OS X 10.14.6 (2019-11)
  • IPython 7.9 with Python 3.8 on Windows 64bit (2019-11)
  • IPython 7.9 with Python 3.7 on Windows 64bit (2019-11)
  • IPython 3.6 with Python 3.6 on Linux 64bit and Macs (2018-04)
  • IPython 3.2 with Python 3.4 on Linux 64bit (2015-06)

Developer installation notes

These notes are for the Man AHL 2019 Hackathon:

conda create -n hackathon_ipython_memory_usage python=3.7
conda activate hackathon_ipython_memory_usage
conda install ipython numpy memory_profiler

mkdir hackathon_ipython_memory_usage
cd hackathon_ipython_memory_usage/
git clone git@github.com:ianozsvald/ipython_memory_usage.git

# note "develop" and not the usual "install" here, to make the local folder editable!
python setup.py develop

# now run ipython and follow the examples from further above in this README

To make a newer development environment:

$ mkdir ipython_memory_usage_dev
$ cd ipython_memory_usage_dev/
$ conda create -n ipython_memory_usage_dev python=3.9 ipython jupyter memory_profiler numpy pandas
$ conda activate ipython_memory_usage_dev
$ git clone git@github.com:ianozsvald/ipython_memory_usage.git

# note "develop" and not the usual "install" here, to make the local folder editable!
$ python setup.py develop

# now run ipython and follow the examples from further above in this README

Acknowledgements

Many thanks to https://github.com/manahl/ for hosting their 2019-11 hackathon. Here we removed old Python 2.x code, added an IPython magic, validated that Python 3.8 is supported and (very nearly) have a working conda recipe. Thanks to my colleagues.

Many thanks to https://github.com/manahl/ for hosting the hackathon (2018-04) that led to us publishing ipython_memory_usage to PyPI: https://pypi.org/project/ipython-memory-usage/. Props to my colleagues for helping me fix the docs and upload to PyPI.

TO FIX

  • merge the perf variation into the main code as some sort of plugin (so it doesn't interfere if perf is not installed or available)
  • possibly add a counter for the size of the garbage collector, to see how many temporary objects are made on each command (disable gc first)? A rough sketch follows below.
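
A rough, untested sketch of that garbage-collector idea (measure_command here is a hypothetical stand-in for the command being measured):

import gc

gc.disable()                       # stop collection so temporaries linger
before = len(gc.get_objects())
result = measure_command()         # hypothetical: the user's command
after = len(gc.get_objects())
print("net new gc-tracked objects: %d" % (after - before))
gc.enable()
# caveat: plain numeric numpy arrays are not gc-tracked, so they would be missed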

Problems

  • I can't figure out how to hook into the live In prompt (at least, I can for static output but not for dynamic output - see the code and the commented-out blocks referring to watch_memory_prompt)
  • python setup.py develop will give you a symlink from your environment back to this development folder; do this if you'd like to work on the project

Notes to Ian

To push to PyPI I need to follow https://docs.python.org/3/distributing/index.html#distributing-index - specifically python setup.py sdist and twine upload dist/*. This uses https://pypi.org/project/twine/.
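
Those two steps as commands:

$ python setup.py sdist
$ twine upload dist/*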
