  • Stars: 481
  • Rank: 91,384 (Top 2 %)
  • Language: Python
  • License: BSD 2-Clause "Sim...
  • Created about 13 years ago
  • Updated about 1 year ago

Repository Details

⚡ Fast scatter density plots for Matplotlib ⚡

About

Plotting millions of points can be slow. Real slow... 😴

So why not use density maps? ⚡

The mpl-scatter-density mini-package makes it easy to create your own scatter density maps, for both interactive and non-interactive use. Fast. The following animation shows real-time interactive use with 10 million points, but interactive performance is still good even with 100 million points (and more if you have enough RAM).

Demo of mpl-scatter-density with NY taxi data

When panning, the density map is shown at a lower resolution to keep things responsive (though this is customizable).
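To make the idea concrete: a scatter density map is, in essence, a 2D histogram of the points drawn as an image. The following is a minimal sketch using plain NumPy, not the package's own implementation (which relies on fast-histogram); the helper name density_map and its arguments are hypothetical.

```python
import numpy as np


def density_map(x, y, extent, shape):
    """Count points per pixel on a regular grid (hypothetical helper).

    extent is (xmin, xmax, ymin, ymax); shape is (ny, nx) in pixels.
    """
    xmin, xmax, ymin, ymax = extent
    ny, nx = shape
    # Pass y first so that rows correspond to y and columns to x,
    # which is the layout imshow expects.
    counts, _, _ = np.histogram2d(
        y, x, bins=(ny, nx), range=[[ymin, ymax], [xmin, xmax]]
    )
    return counts


rng = np.random.default_rng(0)
x = rng.normal(4, 2, 100_000)
y = rng.normal(3, 1, 100_000)

counts = density_map(x, y, extent=(-5, 10, -5, 10), shape=(200, 300))
```

The resulting array could then be displayed with Matplotlib's imshow(counts, origin='lower', extent=(-5, 10, -5, 10), aspect='auto'). The cost of drawing depends on the number of pixels rather than the number of points, which is why this scales to very large datasets.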

To install, simply do:

pip install mpl-scatter-density

This package requires NumPy, Matplotlib, and fast-histogram; pip will install these if they are missing. Both Python 2.7 and Python 3.x are supported, and the package should work correctly on Linux, MacOS X, and Windows.

Usage

There are two main ways to use mpl-scatter-density, both of which are explained below.

scatter_density method

The easiest way to use this package is to import mpl_scatter_density, then create Matplotlib axes as usual but with an added projection='scatter_density' option (if your reaction is 'wait, what?', see the Q&A below). This returns a ScatterDensityAxes instance that has a scatter_density method in addition to all the usual methods (scatter, plot, etc.).

import numpy as np
import mpl_scatter_density
import matplotlib.pyplot as plt

# Generate fake data

N = 10000000
x = np.random.normal(4, 2, N)
y = np.random.normal(3, 1, N)

# Make the plot - note that for the projection option to work, the
# mpl_scatter_density module has to be imported above.

fig = plt.figure()
ax = fig.add_subplot(1, 1, 1, projection='scatter_density')
ax.scatter_density(x, y)
ax.set_xlim(-5, 10)
ax.set_ylim(-5, 10)
fig.savefig('gaussian.png')

Which gives:

Result from the example script

The scatter_density method takes the same options as imshow (for example cmap, alpha, norm, etc.), but also takes the following optional arguments:

  • dpi: this is an integer that is used to determine the resolution of the density map. By default, this is 72, but you can change it as needed, or set it to None to use the default for the Matplotlib backend you are using.
  • downres_factor: this is an integer that is used to determine how much to downsample the density map when panning in interactive mode. Set this to 1 if you don't want any downsampling.
  • color: this can be set to any valid matplotlib color, and will be used to automatically make a monochromatic colormap based on this color. The colormap will fade to transparent, which means that this mode is ideal when showing multiple density maps together.
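As a rough illustration of how dpi and downres_factor interact, the sketch below computes the size of the density grid under the assumption that the map has one pixel per "dot", i.e. axes size in inches times dpi, divided by the downsampling factor while panning. This is an assumption for illustration only, not a transcription of the package's code, and density_grid_shape is a hypothetical helper.

```python
def density_grid_shape(width_in, height_in, dpi=72, downres_factor=1):
    """Return (ny, nx) for a density map covering axes of the given size.

    Assumes one map pixel per dot at the requested dpi, reduced by
    downres_factor (use 1 for no downsampling).
    """
    nx = max(1, round(width_in * dpi) // downres_factor)
    ny = max(1, round(height_in * dpi) // downres_factor)
    return ny, nx


# Axes that are 5 x 4 inches on screen
full = density_grid_shape(5.0, 4.0, dpi=72)
panning = density_grid_shape(5.0, 4.0, dpi=72, downres_factor=4)
print(full, panning)
```

Under this assumption, halving dpi or using a downsampling factor of 4 cuts the number of pixels to fill by a factor of 4 or 16 respectively, which is where the responsiveness while panning comes from. With the package itself, the corresponding call would be ax.scatter_density(x, y, dpi=36, downres_factor=1).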

Here is an example of using the color option:

import numpy as np
import matplotlib.pyplot as plt
import mpl_scatter_density  # noqa

fig = plt.figure()
ax = fig.add_subplot(1, 1, 1, projection='scatter_density')

n = 10000000

x = np.random.normal(0.5, 0.3, n)
y = np.random.normal(0.5, 0.3, n)

ax.scatter_density(x, y, color='red')

x = np.random.normal(1.0, 0.2, n)
y = np.random.normal(0.6, 0.2, n)

ax.scatter_density(x, y, color='blue')

ax.set_xlim(-0.5, 1.5)
ax.set_ylim(-0.5, 1.5)

fig.savefig('double.png')

Which produces the following output:

Result from the example script
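For readers curious what a monochromatic colormap that fades to transparent looks like, here is a minimal sketch built by hand with Matplotlib. It approximates the behaviour described for the color option and is not necessarily how the package builds its colormap; transparent_to_color is a hypothetical helper.

```python
from matplotlib.colors import LinearSegmentedColormap, to_rgb


def transparent_to_color(color, name="fade"):
    """Colormap going from fully transparent to fully opaque `color`."""
    r, g, b = to_rgb(color)
    # Same RGB at both ends; only the alpha channel changes.
    return LinearSegmentedColormap.from_list(
        name, [(r, g, b, 0.0), (r, g, b, 1.0)]
    )


cmap = transparent_to_color("red")
empty = cmap(0.0)  # RGBA for pixels with no points
full = cmap(1.0)   # RGBA for the densest pixels
```

Because empty pixels are fully transparent, several maps drawn with different colors can be layered on the same axes without one hiding the other.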

ScatterDensityArtist

If you are a more experienced Matplotlib user, you might want to use the ScatterDensityArtist directly (this is used behind the scenes in the above example). To use this, initialize the ScatterDensityArtist with the axes as first argument, followed by any arguments you would have passed to scatter_density above (you can also take a look at the docstring for ScatterDensityArtist). You should then add the artist to the axes:

from mpl_scatter_density import ScatterDensityArtist
a = ScatterDensityArtist(ax, x, y)
ax.add_artist(a)

Advanced

Non-linear stretches for high dynamic range plots

In some cases, your density map might have a high dynamic range, and you might therefore want to show the log of the counts rather than the counts. You can do this by passing a matplotlib.colors.Normalize object to the norm argument in the same way as for imshow. For example, the astropy package includes a nice framework for making such a Normalize object for different functions. The following example shows how to show the density map on a log scale:

import numpy as np
import mpl_scatter_density
import matplotlib.pyplot as plt

# Make the norm object to define the image stretch
from astropy.visualization import LogStretch
from astropy.visualization.mpl_normalize import ImageNormalize
norm = ImageNormalize(vmin=0., vmax=1000, stretch=LogStretch())

N = 10000000
x = np.random.normal(4, 2, N)
y = np.random.normal(3, 1, N)

fig = plt.figure()
ax = fig.add_subplot(1, 1, 1, projection='scatter_density')
ax.scatter_density(x, y, norm=norm)
ax.set_xlim(-5, 10)
ax.set_ylim(-5, 10)
fig.savefig('gaussian_log.png')

Which produces the following output:

Result from the example script
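To see what such a log stretch does to the counts, the sketch below reproduces the mapping in plain NumPy: counts are first scaled linearly to the range 0 to 1 using vmin and vmax, then passed through log(a*x + 1) / log(a + 1). The formula and the default a=1000 follow astropy's documented LogStretch as I understand it; treat both as assumptions, and log_stretch as a hypothetical helper rather than part of either library.

```python
import numpy as np


def log_stretch(counts, vmin=0.0, vmax=1000.0, a=1000.0):
    """Map counts to [0, 1] with a logarithmic stretch (illustrative)."""
    counts = np.asarray(counts, dtype=float)
    # Linear normalization to [0, 1], clipping values outside vmin..vmax
    scaled = np.clip((counts - vmin) / (vmax - vmin), 0.0, 1.0)
    # Logarithmic stretch: expands the faint end of the range
    return np.log(a * scaled + 1.0) / np.log(a + 1.0)


levels = log_stretch([0, 1, 10, 100, 1000])
print(levels)
```

A pixel holding a single point is mapped well above the 0.001 it would get with a linear scale, so sparse regions remain visible next to dense ones.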

Adding a colorbar

You can show a colorbar in the same way as you would for an image - the following example shows how to do it:

import numpy as np
import mpl_scatter_density
import matplotlib.pyplot as plt

N = 10000000
x = np.random.normal(4, 2, N)
y = np.random.normal(3, 1, N)

fig = plt.figure()
ax = fig.add_subplot(1, 1, 1, projection='scatter_density')
density = ax.scatter_density(x, y)
ax.set_xlim(-5, 10)
ax.set_ylim(-5, 10)
fig.colorbar(density, label='Number of points per pixel')
fig.savefig('gaussian_colorbar.png')

Which produces the following output:

Result from the example script

Color-coding 'markers' with individual values

In the same way that a 1-D array of values can be passed to Matplotlib's scatter function/method, a 1-D array of values can be passed to scatter_density using the c= argument:

import numpy as np
import mpl_scatter_density
import matplotlib.pyplot as plt

N = 10000000
x = np.random.normal(4, 2, N)
y = np.random.normal(3, 1, N)
c = x - y + np.random.normal(0, 5, N)

fig = plt.figure()
ax = fig.add_subplot(1, 1, 1, projection='scatter_density')
ax.scatter_density(x, y, c=c, vmin=-10, vmax=+10, cmap=plt.cm.RdYlBu)
ax.set_xlim(-5, 13)
ax.set_ylim(-5, 11)
fig.savefig('gaussian_color_coded.png')

Which produces the following output:

Result from the example script

Note that to keep performance as good as possible, the values from the c argument are averaged inside each pixel of the density map, then the colormap is applied. This is a little different from what scatter would converge to in the limit of many points (since in that case it would apply the color to all the markers, then average the colors).
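The per-pixel averaging described above can be sketched with two 2D histograms: one weighted by c and one unweighted, divided element-wise. This is an illustration in plain NumPy under that reading of the text, not the package's actual code; mean_per_pixel is a hypothetical helper.

```python
import numpy as np


def mean_per_pixel(x, y, c, bins, extent):
    """Average of c over the points falling in each pixel (illustrative).

    extent is (xmin, xmax, ymin, ymax). The result is indexed as
    [x_bin, y_bin], and pixels without any points are NaN.
    """
    xmin, xmax, ymin, ymax = extent
    limits = [[xmin, xmax], [ymin, ymax]]
    counts, _, _ = np.histogram2d(x, y, bins=bins, range=limits)
    sums, _, _ = np.histogram2d(x, y, bins=bins, range=limits, weights=c)
    with np.errstate(invalid="ignore", divide="ignore"):
        return sums / counts


# Two points with opposite values share a pixel; a third sits alone.
x = np.array([0.25, 0.25, 0.75])
y = np.array([0.25, 0.25, 0.75])
c = np.array([-10.0, 10.0, 4.0])

mean_c = mean_per_pixel(x, y, c, bins=2, extent=(0, 1, 0, 1))
```

In the shared pixel the mean is 0, so with vmin=-10 and vmax=+10 the colormap is evaluated at its midpoint. Averaging the two marker colors instead would blend the two ends of the colormap, which in general gives a different color.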

Q&A

Isn't this basically the same as datashader?

This follows the same ideas as datashader, but the aim of mpl-scatter-density is specifically to bring datashader-like functionality to Matplotlib users. Furthermore, mpl-scatter-density is intended to be very easy to install - for example it can be installed with pip. But if you have datashader installed and regularly use bokeh, mpl-scatter-density won't do much for you. Note that if you are interested in datashader and Matplotlib together, there is a work in progress (pull request) by @tacaswell to create a Matplotlib artist similar to that in this package but powered by datashader.

What about vaex?

Vaex is a powerful package to visualize large datasets on N-dimensional grids, and therefore has some functionality that overlaps with what is here. However, the aim of mpl-scatter-density is just to provide a lightweight solution to make it easy for users already using Matplotlib to add scatter density maps to their plots rather than provide a complete environment for data visualization. I highly recommend that you take a look at Vaex and determine which approach is right for you!

Why on earth have you defined scatter_density as a projection?

If you are a Matplotlib developer: I truly am sorry for distorting the intended purpose of projection 😊. But you have to admit that it's a pretty convenient way to have users get a custom Axes sub-class even if it has nothing to do with actual projection!

Where do you see this going?

There are a number of things we could add to this package, for example a way to plot density maps as contours, or a way to color code each point by a third quantity and have that reflected in the density map. If you have ideas, please open issues, and even better contribute a pull request! 😄

Can I contribute?

I'm glad you asked - of course you are very welcome to contribute! If you have some ideas, you can open issues or create a pull request directly. Even if you don't have time to contribute actual code changes, I would love to hear from you if you are having issues using this package.

Running tests

To run the tests, you will need pytest and the pytest-mpl plugin. You can then run the tests with:

pytest mpl_scatter_density --mpl
