TALK
I just gave a talk about this at SCaLE 18x. Here are the video of the talk and the “slides”.
NAME
gnuplotlib: a gnuplot-based plotting backend for numpy
SYNOPSIS
import numpy as np
import gnuplotlib as gp
x = np.arange(101) - 50
gp.plot(x**2)
[ basic parabola plot pops up ]
g1 = gp.gnuplotlib(title = 'Parabola with error bars',
_with = 'xyerrorbars')
g1.plot( x**2 * 10, np.abs(x)/10, np.abs(x)*5,
legend = 'Parabola',
tuplesize = 4 )
[ parabola with x,y errorbars pops up in a new window ]
x,y = np.ogrid[-10:11,-10:11]
gp.plot( x**2 + y**2,
title = 'Heat map',
unset = 'grid',
cmds = 'set view map',
_with = 'image',
tuplesize = 3)
[ Heat map pops up where first parabola used to be ]
theta = np.linspace(0, 6*np.pi, 200)
z = np.linspace(0, 5, 200)
g2 = gp.gnuplotlib(_3d = True)
g2.plot( np.cos(theta),
np.vstack((np.sin(theta), -np.sin(theta))),
z )
[ Two 3D spirals together in a new window ]
x = np.arange(1000)
gp.plot( (x*x, dict(histogram=1,
binwidth =10000)),
(x*x, dict(histogram='cumulative', y2=1)))
[ A density and cumulative histogram of x^2 are plotted on the same plot ]
gp.plot( (x*x, dict(histogram=1,
binwidth =10000)),
(x*x, dict(histogram='cumulative')),
_xmin=0, _xmax=1e6,
multiplot='title "multiplot histograms" layout 2,1',
_set='lmargin at screen 0.05')
[ Same histograms, but plotted on two separate plots ]
DESCRIPTION
For an introductory tutorial and some demos, please see the guide:
https://github.com/dkogan/gnuplotlib/blob/master/guide/guide.org
This module allows numpy data to be plotted using Gnuplot as a backend. As much as was possible, this module acts as a passive pass-through to Gnuplot, thus making available the full power and flexibility of the Gnuplot backend. Gnuplot is described in great detail at its upstream website: http://www.gnuplot.info
gnuplotlib has an object-oriented interface (via class gnuplotlib) and a few global class-less functions (plot(), plot3d(), plotimage()). Each instance of class gnuplotlib has a separate gnuplot process and a plot window. If multiple simultaneous plot windows are desired, create a separate class gnuplotlib object for each.
The global functions reuse a single global gnuplotlib instance, so each such invocation replaces the plot in the previous gnuplot window.
The object-oriented interface is used like this:
import gnuplotlib as gp
g = gp.gnuplotlib(options)
g.plot( curve, curve, .... )
The global functions consolidate this into a single call:
import gnuplotlib as gp
gp.plot( curve, curve, ...., options )
Option arguments
Each gnuplotlib object controls ONE gnuplot process. And each gnuplot process produces ONE plot window (or hardcopy) at a time. Each process usually produces ONE subplot at a time (unless we asked for a multiplot). And each subplot contains multiple datasets (referred to as “curves”).
These 3 objects (process, subplot, curve) are controlled by their own set of options, specified as a python dict. A FULL (much more verbose than you would ever be) non-multiplot plot command looks like
import gnuplotlib as gp
g = gp.gnuplotlib( subplot_options, process_options )
curve_options0 = dict(...)
curve_options1 = dict(...)
curve0 = (x0, y0, curve_options0)
curve1 = (x1, y1, curve_options1)
g.plot( curve0, curve1 )
and a FULL multiplot command wraps this once more:
import gnuplotlib as gp
g = gp.gnuplotlib( process_options, multiplot=... )
curve_options0 = dict(...)
curve_options1 = dict(...)
curve0 = (x0, y0, curve_options0)
curve1 = (x1, y1, curve_options1)
subplot_options0 = dict(...)
subplot0 = (curve0, curve1, subplot_options0)
curve_options2 = dict(...)
curve_options3 = dict(...)
curve2 = (x2, y2, curve_options2)
curve3 = (x3, y3, curve_options3)
subplot_options1 = dict(...)
subplot1 = (curve2, curve3, subplot_options1)
g.plot( subplot0, subplot1 )
This is verbose, and rarely will you actually specify everything in this much detail:
- Anywhere that expects process options, you can pass the DEFAULT subplot options and the DEFAULT curve options for all the children. These defaults may be overridden in the appropriate place
- Anywhere that expects plot options you can pass DEFAULT curve options for all the child curves. And these can be overridden also
- Broadcasting (see below) reduces the number of curves you have to explicitly specify
- Implicit domains (see below) reduce the number of numpy arrays you need to pass when specifying each curve
- If only a single curve tuple is to be plotted, it can be inlined
The following are all equivalent ways of making the same plot:
import gnuplotlib as gp
import numpy as np
x = np.arange(10)
y = x*x
# Global function. Non-inlined curves. Separate curve and subplot options
gp.plot( (x,y, dict(_with = 'lines')), title = 'parabola')
# Global function. Inlined curves (possible because we have only one curve).
# The curve, subplot options given together
gp.plot( x,y, _with = 'lines', title = 'parabola' )
# Object-oriented function. Non-inlined curves.
p1 = gp.gnuplotlib(title = 'parabola')
p1.plot((x,y, dict(_with = 'lines')),)
# Object-oriented function. Inlined curves.
p2 = gp.gnuplotlib(title = 'parabola')
p2.plot(x,y, _with = 'lines')
If multiple curves are to be drawn on the same plot, then each ‘curve’ must live in a separate tuple, or we can use broadcasting to stack the extra data in new numpy array dimensions. Identical ways to make the same plot:
import gnuplotlib as gp
import numpy as np
import numpysane as nps
x = np.arange(10)
y = x*x
z = x*x*x
# Object-oriented function. Separate curve and subplot options
p = gp.gnuplotlib(title = 'parabola and cubic')
p.plot((x,y, dict(_with = 'lines', legend = 'parabola')),
(x,z, dict(_with = 'lines', legend = 'cubic')))
# Global function. Separate curve and subplot options
gp.plot( (x,y, dict(_with = 'lines', legend = 'parabola')),
(x,z, dict(_with = 'lines', legend = 'cubic')),
title = 'parabola and cubic')
# Global function. Using the default _with
gp.plot( (x,y, dict(legend = 'parabola')),
(x,z, dict(legend = 'cubic')),
_with = 'lines',
title = 'parabola and cubic')
# Global function. Using the default _with, omitting the 'x' array, and using
# the implicit domain instead
gp.plot( (y, dict(legend = 'parabola')),
(z, dict(legend = 'cubic')),
_with = 'lines',
title = 'parabola and cubic')
# Global function. Using the default _with, inlining the curve options, omitting
# the 'x' array, and using the implicit domain instead. Using broadcasting for
# the data and for the legend, inlining the one curve
gp.plot( nps.cat(y,z),
legend = np.array(('parabola','cubic')),
_with = 'lines',
title = 'parabola and cubic')
When making a multiplot (see below) we have multiple subplots in a plot. For instance I can plot a sin() and a cos() on top of each other:
import gnuplotlib as gp
import numpy as np
th = np.linspace(0, np.pi*2, 30)
gp.plot( (th, np.cos(th), dict(title="cos")),
(th, np.sin(th), dict(title="sin")),
_xrange = [0,2.*np.pi],
_yrange = [-1,1],
multiplot='title "multiplot sin,cos" layout 2,1')
Process options are parameters that affect the whole plot window, like the output filename, whether to test each gnuplot command, etc. We have ONE set of process options for ALL the subplots. These are passed into the gnuplotlib constructor or appear as keyword arguments in a global plot() call. All of these are described below in “Process options”.
Subplot options are parameters that affect a subplot. Unless we’re multiplotting, there’s only one subplot, so we have a single set of process options and a single set of subplot options. Together these are sometimes referred to as “plot options”. Examples are the title of the plot, the axis labels, the extents, 2D/3D selection, etc. If we aren’t multiplotting, these are passed into the gnuplotlib constructor or appear as keyword arguments in a global plot() call. In a multiplot, these are passed as a python dict in the last element of each subplot tuple. Or the default values can be given where process options usually live. All of these are described below in “Subplot options”.
Curve options: parameters that affect only a single curve. These are given as a python dict in the last element of each curve tuple. Or the defaults can appear where process or subplot options are expected. Each is described below in “Curve options”.
A few helper global functions are available:
plot3d(…) is equivalent to plot(…, _3d=True)
plotimage(…) is equivalent to plot(…, _with='image', tuplesize=3)
Data arguments
The ‘curve’ arguments in the plot(…) argument list represent the actual data being plotted. Each output data point is a tuple (set of values, not a python “tuple”) whose size varies depending on what is being plotted. For example if we’re making a simple 2D x-y plot, each tuple has 2 values. If we’re making a 3D plot with each point having variable size and color, each tuple has 5 values: (x,y,z,size,color). When passing data to plot(), each tuple element is passed separately by default (unless we have a negative tuplesize; see below). So if we want to plot N 2D points we pass the two numpy arrays of shape (N,):
gp.plot( x,y )
By default, gnuplotlib assumes tuplesize==2 when plotting in 2D and tuplesize==3 when plotting in 3D. If we’re doing anything else, then the ‘tuplesize’ curve option MUST be passed in:
gp.plot( x,y,z,size,color,
tuplesize = 5,
_3d = True,
_with = 'points ps variable palette' )
This is required because you may be using implicit domains (see below) and/or broadcasting, so gnuplotlib has no way to know the intended tuplesize.
Broadcasting
Broadcasting is fully supported, so multiple curves can be plotted by stacking data inside the passed-in arrays. Broadcasting works across curve options also, so things like curve labels and styles can also be stacked inside arrays:
th = np.linspace(0, 6*np.pi, 200)
z = np.linspace(0, 5, 200)
size = 0.5 + np.abs(np.cos(th))
color = np.sin(2*th)
# without broadcasting:
gp.plot3d( ( np.cos(th), np.sin(th),
z, size, color,
dict(legend = 'spiral 1') ),
( -np.cos(th), -np.sin(th),
z, size, color,
dict(legend = 'spiral 2') ),
tuplesize = 5,
title = 'double helix',
_with = 'points pointsize variable pointtype 7 palette' )
# identical plot using broadcasting:
gp.plot3d( ( np.cos(th) * np.array([[1,-1]]).T,
np.sin(th) * np.array([[1,-1]]).T,
z, size, color,
dict( legend = np.array(('spiral 1', 'spiral 2')))),
tuplesize = 5,
title = 'double helix',
_with = 'points pointsize variable pointtype 7 palette' )
This is a 3D plot with variable size and color. There are 5 values in the tuple, which we specify. The first 2 arrays have shape (2,N); all the other arrays have shape (N,). Thus the broadcasting rules generate 2 distinct curves, with varying values for x,y and identical values for z, size and color. We label the curves differently by passing an array for the ‘legend’ curve option. This array contains strings, and is broadcast like everything else.
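The shapes described above can be checked with numpy alone. The following is a minimal sketch that only inspects the arrays from the example; it does not call gnuplotlib, and the names `sign`, `x`, `y` and `curves_shape` are introduced here for illustration.

```python
import numpy as np

th = np.linspace(0, 6*np.pi, 200)
z  = np.linspace(0, 5, 200)

# A (2,1) column of signs; multiplying a (200,) array by it yields shape (2,200)
sign = np.array([[1, -1]]).T
x = np.cos(th) * sign   # row 0: cos(th), row 1: -cos(th)
y = np.sin(th) * sign   # row 0: sin(th), row 1: -sin(th)

# numpy broadcasts the (2,200) arrays against the (200,) array z.
# The leading dimension of length 2 is what becomes the 2 curves
curves_shape = np.broadcast(x, y, z).shape
print(x.shape, z.shape, curves_shape)
```

The z, size and color arrays have no leading dimension, so the same values are reused for both curves.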
Negative tuplesize
If we have all the data elements in a single array, plotting them is a bit awkward. Here are two ways:
xy = .... # Array of shape (N,2). Each slice is (x,y)
gp.plot(xy[:,0], xy[:,1])
gp.plot(*xy.T)
The *xy.T version is concise, but is only possible if we’re plotting one curve: python syntax doesn’t allow any arguments after a *-expanded tuple. With more than one curve you’re left with the first version, which is really verbose, especially with a large tuplesize. gnuplotlib handles this case with a shorthand: negative tuplesize. The above can be represented nicely like this:
gp.plot(xy, tuplesize = -2)
This means that each point has 2 values, but that instead of reading each one in a separate array, we have ONE array, with the values in the last dimension.
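To make the equivalence concrete, here is a small numpy-only sketch showing that the columns of an (N,2) array are exactly what the two-array forms receive. The contents of `xy` are made up for illustration, and the gnuplotlib calls appear only as comments since they need a running gnuplot.

```python
import numpy as np

# Hypothetical data: N=5 points, each row is (x, y)
xy = np.array([[0, 0], [1, 1], [2, 4], [3, 9], [4, 16]])

# What the verbose form passes: one array per tuple element
x, y = xy[:, 0], xy[:, 1]

# What *xy.T expands to: the same two arrays
x_t, y_t = xy.T

# These three calls would therefore be given the same data:
#   gp.plot(xy[:,0], xy[:,1])
#   gp.plot(*xy.T)
#   gp.plot(xy, tuplesize = -2)
print(np.array_equal(x, x_t), np.array_equal(y, y_t))
```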
Implicit domains
gnuplotlib looks for tuplesize different arrays for each curve. It is common for the first few arrays to be predictable by gnuplotlib, and in those cases it’s a chore to require the user to pass those in. Thus, if there are fewer than tuplesize arrays available, gnuplotlib will try to use an implicit domain. This happens if we are EXACTLY 1 or 2 arrays short (usually when making 2D and 3D plots respectively).
If exactly 1 dimension is missing, gnuplotlib will use np.arange(N) as the domain: we plot the given values in a row, one after another. Thus
gp.plot(np.array([1,5,3,4,4]))
is equivalent to
gp.plot(np.arange(5), np.array([1,5,3,4,4]) )
Only 1 array was given, but the default tuplesize is 2, so we are 1 array short.
If we are exactly 2 arrays short, gnuplotlib will use a 2D grid as a domain. Example:
xy = np.arange(21*21).reshape(21,21)
gp.plot( xy, _with = 'points', _3d=True)
Here the only given array has dimensions (21,21). This is a 3D plot, so we are exactly 2 arrays short. Thus, gnuplotlib generates an implicit domain, corresponding to a 21-by-21 grid. Note that in all other cases, each curve takes in tuplesize 1-dimensional arrays, while here it takes tuplesize-2 2-dimensional arrays.
Also, note that while the DEFAULT tuplesize depends on whether we’re making a 3D plot, once a tuplesize is given, the logic doesn’t care if a 3D plot is being made. It can make sense to have a 2D implicit domain when making 2D plots. For example, one can be plotting a color map:
x,y = np.ogrid[-10:11,-10:11]
gp.plot( x**2 + y**2,
title = 'Heat map',
set = 'view map',
_with = 'image',
tuplesize = 3)
Also note that the ‘tuplesize’ curve option is independent of implicit domains. This option specifies not how many data arrays we have, but how many values represent each data point. For example, if we want a 2D line plot with varying colors plotted with an implicit domain, set tuplesize=3 as before (x,y,color), but pass in only 2 arrays (y, color).
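The counting rule in this section can be summarized in a few lines. This is a sketch of the rule as described here, not gnuplotlib's actual code; the function name `implicit_domain_dims` is hypothetical.

```python
def implicit_domain_dims(tuplesize, n_arrays):
    """Return the number of implicit domain dimensions (0, 1 or 2).

    Hypothetical helper that models the rule in the text: a curve may be
    exactly 1 or 2 arrays short of tuplesize, and nothing else.
    """
    missing = tuplesize - n_arrays
    if missing not in (0, 1, 2):
        raise ValueError(
            "expected {} arrays (or 1 or 2 fewer), got {}".format(tuplesize, n_arrays))
    return missing

# gp.plot(y): the default 2D tuplesize is 2, one array given
print(implicit_domain_dims(2, 1))
# Heat map: tuplesize=3, one 2-dimensional array given
print(implicit_domain_dims(3, 1))
# 2D line with varying color: tuplesize=3, arrays (y, color) given
print(implicit_domain_dims(3, 2))
```

The first and third cases imply a 1D domain; the second implies a 2D grid, regardless of whether the plot itself is 2D or 3D.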
Multiplots
Usually each gnuplotlib object makes one plot at a time. And as a result, we have one set of process options and subplot options at a time (known together as “plot options”). Sometimes this isn’t enough, and we really want to draw multiple plots in a single window (or hardcopy) with a gnuplotlib.plot() call. This situation is called a “multiplot”. We enter this mode by passing a “multiplot” process option, which is a string passed directly to gnuplot in its “set multiplot …” command. See the corresponding gnuplot documentation for details:
gnuplot -e "help multiplot"
Normally we make plots like this:
gp.plot( (x0, y0, curve_options0),
(x1, y1, curve_options1),
...,
subplot_options, process_options)
In multiplot mode, the gnuplotlib.plot() command takes on one more level of indirection:
gp.plot( ( (x0, y0, curve_options0),
(x1, y1, curve_options1),
...
subplot_options0 ),
( (x2, y2, curve_options2),
(x3, y3, curve_options3),
...
subplot_options1 ),
...,
process_options )
The process options can appear at the end of the gp.plot() global call, or in the gnuplotlib() constructor. Subplot option and curve option defaults can appear there too. Subplot options and curve option defaults appear at the end of each subplot tuple.
A few options are valid as both process and subplot options: ‘cmds’, ‘set’, ‘unset’. If one of these (‘set’ for instance) is given as BOTH a process and subplot option, we execute BOTH of them. This is different from the normal behavior, where the outer option is treated as a default to be overridden, instead of contributed to.
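One way to picture the difference between the two behaviors is the following sketch. It is a simplified model of the rule described above, not gnuplotlib's implementation: the helper names `as_list` and `effective_subplot_options` are hypothetical, and placing the process-level commands before the subplot-level ones is an assumption.

```python
# Options that are contributed to, instead of being overridden
ACCUMULATING = ('cmds', 'set', 'unset')

def as_list(value):
    """Normalize None, a string, or a list/tuple of strings into a list."""
    if value is None:
        return []
    if isinstance(value, str):
        return [value]
    return list(value)

def effective_subplot_options(process_opts, subplot_opts):
    """Model how outer (process) and inner (subplot) options combine."""
    merged = dict(process_opts)     # outer values act as defaults
    merged.update(subplot_opts)     # inner values override them
    for key in ACCUMULATING:        # except these, where BOTH are used
        combined = as_list(process_opts.get(key)) + as_list(subplot_opts.get(key))
        if combined:
            merged[key] = combined
    return merged

merged = effective_subplot_options(dict(set='grid', title='outer'),
                                   dict(set=['xtics 1'], title='inner'))
print(merged)
```

Here 'title' behaves like an ordinary default that the subplot overrides, while both 'set' entries survive.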
Multiplot mode is useful, but has a number of limitations and quirks. For instance, interactive zooming and measuring aren’t possible. And since each subplot is independent, extra commands may be needed to align axes in different subplots: see “help margin” in gnuplot for how to do this. Do read the gnuplot docs in detail when touching any of this. Sample to plot two sinusoids above one another:
import gnuplotlib as gp
import numpy as np
th = np.linspace(0, np.pi*2, 30)
gp.plot( (th, np.cos(th), dict(title="cos")),
(th, np.sin(th), dict(title="sin")),
_xrange = [0,2.*np.pi],
_yrange = [-1,1],
multiplot='title "multiplot sin,cos" layout 2,1')
Symbolic equations
Gnuplot can plot both data and equations. This module exists largely for the data-plotting case, but sometimes it can be useful to plot equations together with some data. This is supported by the ‘equation…’ subplot option. This is either a string (for a single equation) or a list/tuple containing multiple strings for multiple equations. An example:
import numpy as np
import numpy.random as nr
import numpy.linalg
import gnuplotlib as gp
# generate data
x = np.arange(100)
c = np.array([1, 1800, -100, 0.8]) # coefficients
m = x[:, np.newaxis] ** np.arange(4) # 1, x, x**2, ...
noise = 1e4 * nr.random(x.shape)
y = np.dot( m, c) + noise # polynomial corrupted by noise
c_fit = np.dot(numpy.linalg.pinv(m), y) # coefficients obtained by a curve fit
# generate a string that describes the curve-fitted equation
fit_equation = '+'.join( '{} * {}'.format(c,m) for c,m in zip( c_fit.tolist(), ('x**0','x**1','x**2','x**3')))
# plot the data points and the fitted curve
gp.plot(x, y, _with='points', equation = fit_equation)
Here I generated some data, performed a curve fit to it, and plotted the data points together with the best-fitting curve. Here the best-fitting curve was plotted by gnuplot as an equation, so gnuplot was free to choose the proper sampling frequency. And as we zoom around the plot, the sampling frequency is adjusted to keep things looking nice.
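To see what kind of string gnuplot receives, the same string construction can be run with known coefficients. This sketch uses the true coefficients from the example in place of the fitted ones, an assumption made so that the result is deterministic; the real fit returns different values because of the noise.

```python
# Stand-in for c_fit.tolist() (assumed values, see above)
c_fit = [1.0, 1800.0, -100.0, 0.8]
terms = ('x**0', 'x**1', 'x**2', 'x**3')

# Same expression as in the example: one "coefficient * power" term per
# coefficient, joined with '+'
fit_equation = '+'.join('{} * {}'.format(c, m) for c, m in zip(c_fit, terms))
print(fit_equation)
# → 1.0 * x**0+1800.0 * x**1+-100.0 * x**2+0.8 * x**3
```

A negative coefficient produces a '+-' sequence, that is a binary plus followed by a unary minus; the example above relies on gnuplot accepting this in its expressions.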
Note that the various styles and options set by the other options do NOT apply to these equation plots. Instead, the string is passed to gnuplot directly, and any styling can be applied there. For instance, to plot a parabola with thick lines, you can issue
gp.plot( ....., equation = 'x**2 with lines linewidth 2')
As before, see the gnuplot documentation for details. You can do fancy things:
x = np.arange(100, dtype=float) / 100 * np.pi * 2
c,s = np.cos(x), np.sin(x)
gp.plot( c,s,
square=1, _with='points',
set = ('parametric', 'trange [0:2*3.14]'),
equation = "sin(t),cos(t)" )
Here the data are points evenly spaced around a unit circle. Along with these points we plot a unit circle as a parametric equation.
Histograms
It is possible to use gnuplot’s internal histogram support, which uses gnuplot to handle all the binning. A simple example:
x = np.arange(1000)
gp.plot( (x*x, dict(histogram = 'freq', binwidth=10000)),
(x*x, dict(histogram = 'cumulative', y2=1)))
To use this, pass ‘histogram = HISTOGRAM_TYPE’ as a curve option. If the type is any non-string that evaluates to True, we use the ‘freq’ type: a basic frequency histogram. Otherwise, the types are whatever gnuplot supports. See the output of ‘help smooth’ in gnuplot. The most common types are
- freq: frequency
- cumulative: integral of freq. Runs from 0 to N, where N is the number of samples
- cnormal: like ‘cumulative’, but rescaled to run from 0 to 1
The ‘binwidth’ curve option specifies the size of the bins. This must match for ALL histogram curves in a plot. If omitted, this is assumed to be 1. As usual, the user can specify whatever styles they want using the ‘with’ curve option. If omitted, you get reasonable defaults: boxes for ‘freq’ histograms and lines for cumulative ones.
This only makes sense with 2D plots with tuplesize=1
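The relationship between the three common types can be illustrated with numpy alone. This is only a sketch of what the types mean: gnuplot does its own binning, and the bin edge placement used here is an assumption, so individual bin counts may differ from what gnuplot draws.

```python
import numpy as np

x = np.arange(1000)
data = x*x
binwidth = 10000

# Bin edges spaced binwidth apart, covering the data (edge placement is an
# assumption; gnuplot may align its bins differently)
edges = np.arange(0, data.max() + binwidth, binwidth)

freq, _    = np.histogram(data, bins=edges)  # 'freq': count per bin
cumulative = np.cumsum(freq)                 # 'cumulative': ends at N
cnormal    = cumulative / len(data)          # 'cnormal': ends at 1

print(freq.sum(), cumulative[-1], cnormal[-1])
```

Since 'freq' and 'cumulative' have very different vertical scales, plotting the cumulative curve on the y2 axis (as in the example above) keeps both readable.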
Plot persistence and blocking
As currently written, gnuplotlib does NOT block and the plot windows do NOT persist. I.e.
- the ‘plot()’ functions return immediately, and the user interacts with the plot WHILE THE REST OF THE PYTHON PROGRAM IS RUNNING
- when the python program exits, the gnuplot process and any visible plots go away
If you want to write a program that just shows a plot, and exits when the user closes the plot window, you should do any of
- add wait=True to the process options dict
- call wait() on your gnuplotlib object
- call the global gnuplotlib.wait(), if you have a global plot
Please note that it’s not at all trivial to detect if a current plot window exists. If not, this function will end up waiting forever, and the user will need to Ctrl-C.
OPTIONS
Process options
The process options are a dictionary, passed as the keyword arguments to the global plot() function or to the gnuplotlib constructor. The supported keys of this dict are as follows:
- hardcopy, output
These are synonymous. Instead of drawing a plot on screen, plot into a file instead. The output filename is the value associated with this key. If the “terminal” plot option is given, that sets the output format; otherwise the output format is inferred from the filename. Currently only eps, ps, pdf, png, svg, gp are supported with some default sets of options. For any other formats you MUST provide the ‘terminal’ option as well. Example:
plot(..., hardcopy="plot.pdf")
[ Plots into that file ]
Note that the “.gp” format is special. Instead of asking gnuplot to make a plot using a specific terminal, writing to “xxx.gp” will create a self-plotting data file that is visualized with gnuplot.
- terminal
Selects the gnuplot terminal (backend). This determines how Gnuplot generates its output. Common terminals are ‘x11’, ‘qt’, ‘pdf’, ‘dumb’ and so on. See the Gnuplot docs for all the details.
There are several gnuplot terminals that are known to be interactive: “x11”, “qt” and so on. For these no “output” setting is desired. For noninteractive terminals (“pdf”, “dumb” and so on) the output will go to the file defined by the output/hardcopy key. If this plot option isn’t defined or set to the empty string, the output will be redirected to the standard output of the python process calling gnuplotlib.
>>> gp.plot( np.linspace(-5,5,30)**2,
...          unset='grid', terminal='dumb 80 40' )
[ An 80x40 ASCII-art rendering of the parabola is printed to standard output; the x axis runs from 0 to 30 and the y axis from 0 to 25 ]
- set/unset
Either a string or a list/tuple; if given a list/tuple, each element is used in a separate set/unset command. Example:
plot(..., set='grid', unset=['xtics', 'ytics'])
[ turns on the grid, turns off the x and y axis tics ]
This is both a process and a subplot option. If both are given, BOTH are used, instead of the normal behavior of a subplot option overriding the process option
- cmds
Either a string or a list/tuple; if given a list/tuple, each element is used in a separate command. Arbitrary extra commands to pass to gnuplot before the plots are created. These are passed directly to gnuplot, without any validation.
This is both a process and a subplot option. If both are given, BOTH are used, instead of the normal behavior of a subplot option overriding the process option
- dump
Used for debugging. If true, writes out the gnuplot commands to STDOUT instead of writing to a gnuplot process. Useful to see what commands would be sent to gnuplot. This is a dry run. Note that this dump will contain binary data unless ascii-only plotting is enabled (see below). This is also useful to generate gnuplot scripts since the dumped output can be sent to gnuplot later, manually if desired. Look at the ‘notest’ option for a less verbose dump.
- log
Used for debugging. If true, writes out the gnuplot commands and various progress logs to STDERR in addition to writing to a gnuplot process. This is NOT a dry run: data is sent to gnuplot AND to the log. Useful for debugging I/O issues. Note that this log will contain binary data unless ascii-only plotting is enabled (see below)
- ascii
If set, ASCII data is passed to gnuplot instead of binary data. Binary is the default because it is much more efficient (and thus faster). Any time you’re plotting something that isn’t just numbers (labels, time/date strings, etc) ascii communication is required instead. gnuplotlib tries to auto-detect when this is needed, but sometimes you do have to specify this manually.
- notest
Don’t check for failure after each gnuplot command. And don’t test all the plot options before creating the plot. This is generally only useful for debugging or for more sparse ‘dump’ functionality.
- wait
When we’re done asking gnuplot to make a plot, we ask gnuplot to tell us when the user closes the interactive plot window that popped up. The python process will block until the user is done looking at the data. This can also be achieved by calling the wait() gnuplotlib method or the global gnuplotlib.wait() function.
Subplot options
The subplot options are a dictionary, passed as the keyword arguments to the global plot() function or to the gnuplotlib constructor (when making single plots) or as the last element in each subplot tuple (when making multiplots). Default subplot options may be passed in together with the process options. The supported keys of this dict are as follows:
- title
Specifies the title of the plot
- 3d
If true, a 3D plot is constructed. This changes the default tuple size from 2 to 3
- _3d
Identical to ‘3d’. In python, keyword argument keys cannot start with a number, so ‘_3d’ is accepted for that purpose. Same issue exists with with/_with
- set/unset
Either a string or a list/tuple; if given a list/tuple, each element is used in a separate set/unset command. Example:
plot(..., set='grid', unset=['xtics', 'ytics'])
[ turns on the grid, turns off the x and y axis tics ]
This is both a process and a subplot option. If both are given, BOTH are used, instead of the normal behavior of a subplot option overriding the process option
- cmds
Either a string or a list/tuple; if given a list/tuple, each element is used in a separate command. Arbitrary extra commands to pass to gnuplot before the plots are created. These are passed directly to gnuplot, without any validation.
This is both a process and a subplot option. If both are given, BOTH are used, instead of the normal behavior of a subplot option overriding the process option
- with
If no ‘with’ curve option is given, use this as a default. See the description of the ‘with’ curve option for more detail
- _with
Identical to ‘with’. In python ‘with’ is a reserved word so it is illegal to use it as a keyword arg key, so ‘_with’ exists as an alias. Same issue exists with 3d/_3d
- square, square_xy, square-xy, squarexy
If True, these request a square aspect ratio. For 3D plots, square_xy plots with a square aspect ratio in x and y, but scales z. square_xy and square-xy and squarexy are synonyms. In 2D, these are all synonyms. Using any of these in 3D requires Gnuplot >= 4.4
- {x,y,y2,z,cb}{min,max,range,inv}
If given, these set the extents of the plot window for the requested axes. Either min/max or range can be given but not both. min/max are numerical values. ‘*range’ is a string ‘min:max’ with either one allowed to be omitted; it can also be a [min,max] tuple or list. ‘*inv’ is a boolean that reverses this axis. If the bounds are known, this can also be accomplished by setting max < min. Passing in both max < min AND inv also results in a reversed axis.
If no information about a range is given, it is not touched: the previous zoom settings are preserved.
The y2 axis is the secondary y-axis that is enabled by the ‘y2’ curve option. The ‘cb’ axis represents the color axis, used when color-coded plots are being generated
- xlabel, ylabel, zlabel, y2label, cblabel
These specify axis labels
- rgbimage
This should be set to a path containing an image file on disk. The data is then plotted on top of this image, which is very useful for annotations, computer vision, etc. Note that when plotting data, the y axis usually points up, but when looking at images, the y axis of the pixel coordinates points down instead. Thus, if the y axis extents aren’t given and an rgbimage IS specified, gnuplotlib will flip the y axis to make things look reasonable. If any y-axis ranges are given, however (with any of the ymin,ymax,yrange,yinv subplot options), then it is up to the user to flip the axis, if that’s what they want.
- equation, equation_above, equation_below
Either a string or a list/tuple; if given a list/tuple, each element is used as a separate equation to plot. These options allow equations represented as formula strings to be plotted along with data passed in as numpy arrays. See the “Symbolic equations” section above.
By default, the equations are plotted BEFORE other data, so the data plotted later may obscure some of the equation. Depending on what we’re doing, this may or may not be what we want. To plot the equations AFTER other data, use ‘equation_above’ instead of ‘equation’. The ‘equation_below’ option is a synonym for ‘equation’
Curve options
The curve options describe details of specific curves. They are in a dict, whose keys are as follows:
- legend
Specifies the legend label for this curve
- with
Specifies the style for this curve. The value is passed to gnuplot using its ‘with’ keyword, so valid values are whatever gnuplot supports. Read the gnuplot documentation for the ‘with’ keyword for more information
- _with
Identical to ‘with’. In python ‘with’ is a reserved word so it is illegal to use it as a keyword arg key, so ‘_with’ exists as an alias
- y2
If true, requests that this curve be plotted on the y2 axis instead of the main y axis
- tuplesize
Described in the “Data arguments” section above. Specifies how many values represent each data point. For 2D plots this defaults to 2; for 3D plots this defaults to 3. These defaults are correct for simple plots. For each curve we expect to get tuplesize separate arrays of data unless any of these are true
- If tuplesize < 0, we expect to get a single numpy array, with each data tuple in the last dimension. See the “Negative tuplesize” section above for detail.
- If we receive fewer than tuplesize arrays, we may be using “Implicit domains”. See the “Implicit domains” section above for detail.
- using
Overrides the ‘using’ directive we pass to gnuplot. No error checking is performed, and the string is passed to gnuplot verbatim. This option is very rarely needed. The most common usage is to apply a function to an implicit domain. For instance, this basic command plots a line (linearly increasing values) against a linearly-increasing line number:
gp.plot(np.arange(100))
We can plot the same values against the square-root of the line number to get a parabola:
gp.plot(np.arange(100), using='(sqrt($1)):2')
- histogram
If given and if it evaluates to True, gnuplot will plot the histogram of this data instead of the data itself. See the “Histograms” section above for more details. If this curve option is a string, it’s expected to be one of the smoothing style gnuplot understands (see ‘help smooth’). Otherwise we assume the most common style: a frequency histogram. This only makes sense with 2D plots and tuplesize=1
- binwidth
Used for the histogram support. See the “Histograms” section above for more details. This sets the width of the histogram bins. If omitted, the width is set to 1.
INTERFACE
class gnuplotlib
A gnuplotlib object abstracts a gnuplot process and a plot window. A basic non-multiplot invocation:
import gnuplotlib as gp
g = gp.gnuplotlib(subplot_options, process_options)
g.plot( curve, curve, .... )
The subplot options are passed into the constructor; the curve options and the data are passed into the plot() method. One advantage of making plots this way is that there’s a gnuplot process associated with each gnuplotlib instance, so as long as the object exists, the plot will be interactive. Calling ‘g.plot()’ multiple times reuses the plot window instead of creating a new one.
global plot(…)
The convenience plotting routine in gnuplotlib. Invocation:
import gnuplotlib as gp
gp.plot( curve, curve, ...., subplot_and_default_curve_options )
Each ‘plot()’ call reuses the same window.
global plot3d(…)
Generates 3D plots. Shorthand for ‘plot(…, _3d=True)’
global plotimage(…)
Generates an image plot. Shorthand for ‘plot(…, _with='image', tuplesize=3)’
global wait(…)
Blocks until the user closes the interactive plot window. Useful for python applications that want blocking plotting behavior. This can also be achieved by calling the wait() gnuplotlib method or by adding wait=1 to the process options dict.
RECIPES
Some very brief usage notes appear here. For a tutorial and more in-depth recipes, please see the guide:
https://github.com/dkogan/gnuplotlib/blob/master/guide/guide.org
2D plotting
If we’re plotting y-values sequentially (implicit domain), all we need is
plot(y)
If we also have a corresponding x domain, we can plot y vs. x with
plot(x, y)
Simple style control
To change line thickness:
plot(x,y, _with='lines linewidth 3')
To change point size and point type:
gp.plot(x,y, _with='points pointtype 4 pointsize 8')
Everything (like _with) feeds directly into Gnuplot, so look at the Gnuplot docs to know how to change thicknesses, styles and such.
Errorbars
To plot errorbars that show y +- 1, plotted with an implicit domain
plot( y, np.ones(y.shape), _with = 'yerrorbars', tuplesize = 3 )
Same with an explicit x domain:
plot( x, y, np.ones(y.shape), _with = 'yerrorbars', tuplesize = 3 )
Symmetric errorbars on both x and y. x +- 1, y +- 2:
plot( x, y, np.ones(x.shape), 2*np.ones(y.shape), _with = 'xyerrorbars', tuplesize = 4 )
To plot asymmetric errorbars that show the range y-1 to y+2 (note that here you must specify the actual errorbar-end positions, NOT just their deviations from the center; this is how Gnuplot does it)
plot( y, y - np.ones(y.shape), y + 2*np.ones(y.shape),
_with = 'yerrorbars', tuplesize = 4 )
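The tuplesize bookkeeping in these recipes can be easy to lose track of, so here is a small sketch with concrete data. The sample arrays and the helper name plot_errorbars are made up for illustration. The point is that tuplesize counts the domain too, so with an implicit domain one fewer array is passed.

```python
import numpy as np

x = np.arange(10, dtype=float)
y = x**2

# Symmetric y +- 1: pass the deviation. tuplesize=3 counts x, y, ydelta
sym_implicit = (y, np.ones(y.shape))        # x is the implicit domain
sym_explicit = (x, y, np.ones(y.shape))

# Asymmetric range y-1 .. y+2: pass the end positions, not the deviations.
# tuplesize=4 counts x, y, ylow, yhigh
asym_implicit = (y, y - 1, y + 2)

def plot_errorbars():
    """Draw the symmetric plot, then replace it with the asymmetric one."""
    import gnuplotlib as gp   # needs gnuplotlib and a working gnuplot
    gp.plot(*sym_explicit,  _with='yerrorbars', tuplesize=3)
    gp.plot(*asym_implicit, _with='yerrorbars', tuplesize=4)
```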
More multi-value styles
Plotting with variable-size circles (size given in plot units, requires Gnuplot >= 4.4)
plot(x, y, radii,
_with = 'circles', tuplesize = 3)
Plotting with a variably-sized arbitrary point type (size given in multiples of the “default” point size)
plot(x, y, sizes,
_with = 'points pointtype 7 pointsize variable', tuplesize = 3 )
Color-coded points
plot(x, y, colors,
_with = 'points palette', tuplesize = 3 )
Variable-size AND color-coded circles. A Gnuplot (4.4.0) quirk makes it necessary to specify the color range here
plot(x, y, radii, colors,
cbmin = mincolor, cbmax = maxcolor,
_with = 'circles palette', tuplesize = 4 )
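The recipes above leave radii, colors, mincolor and maxcolor undefined. One possible way to fill them in is sketched below; the particular data is arbitrary and the helper name plot_circles is hypothetical.

```python
import numpy as np

x = np.linspace(0, 10, 20)
y = np.sin(x)

radii  = 0.1 + 0.05 * np.abs(y)   # circle radii, in plot units
colors = x                        # one palette value per point

# Explicit color range, passed as cbmin/cbmax
mincolor, maxcolor = colors.min(), colors.max()

def plot_circles():
    """Variable-size, color-coded circles: x, y, radius, color per point."""
    import gnuplotlib as gp   # needs gnuplotlib and a working gnuplot
    gp.plot(x, y, radii, colors,
            cbmin=mincolor, cbmax=maxcolor,
            _with='circles palette', tuplesize=4)
```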
Broadcasting example
Let’s plot the Conchoids of de Sluze. The whole family of curves is generated all at once, and plotted all at once with broadcasting. Broadcasting is also used to generate the labels. Generally these would be strings, but here just printing the numerical value of the parameter is sufficient.
theta = np.linspace(0, 2*np.pi, 1000) # dim=( 1000,)
a = np.arange(-4,3)[:, np.newaxis] # dim=(7,1)
gp.plot( theta,
1./np.cos(theta) + a*np.cos(theta), # broadcasted. dim=(7,1000)
_with = 'lines',
set = 'polar',
square = True,
yrange = [-5,5],
legend = a.ravel() )
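The shape bookkeeping behind this example can be checked with numpy alone. The sketch below repeats the arrays from the recipe and only inspects their dimensions; nothing is plotted.

```python
import numpy as np

theta = np.linspace(0, 2*np.pi, 1000)    # shape (1000,)
a     = np.arange(-4, 3)[:, np.newaxis]  # shape (7, 1): one row per curve

# (1000,) combined with (7, 1) broadcasts to (7, 1000): 7 curves of 1000 points
r = 1./np.cos(theta) + a*np.cos(theta)

# One legend entry per curve
legend = a.ravel()                       # shape (7,)

print(r.shape, legend.shape)
```

Each row of r is one member of the family, so the single gp.plot() call in the recipe draws seven curves against the shared theta.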
3D plotting
General style control works identically for 3D plots as in 2D plots.
To plot a set of 3D points, with a square aspect ratio (squareness requires Gnuplot >= 4.4):
plot3d(x, y, z, square = 1)
If xy is a 2D array, we can plot it as a height map on an implicit domain
plot3d(xy)
Ellipse and sphere plotted together, using broadcasting:
th = np.linspace(0, np.pi*2, 30)
ph = np.linspace(-np.pi/2, np.pi*2, 30)[:,np.newaxis]
x_3d = (np.cos(ph) * np.cos(th)) .ravel()
y_3d = (np.cos(ph) * np.sin(th)) .ravel()
z_3d = (np.sin(ph) * np.ones( th.shape )) .ravel()
gp.plot3d( (x_3d * np.array([[1,2]]).T,
y_3d * np.array([[1,2]]).T,
z_3d,
{ 'legend': np.array(('sphere', 'ellipse'))}),
title = 'sphere, ellipse',
square = True,
_with = 'points')
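As a sanity check of the broadcasting in this recipe, the sketch below rebuilds the same arrays and verifies the shapes and the geometry with numpy only. The names scale, xs and ys are introduced here for readability and are not part of the recipe.

```python
import numpy as np

th = np.linspace(0, np.pi*2, 30)                        # shape (30,)
ph = np.linspace(-np.pi/2, np.pi*2, 30)[:, np.newaxis]  # shape (30, 1)

x_3d = (np.cos(ph) * np.cos(th)).ravel()                # shape (900,)
y_3d = (np.cos(ph) * np.sin(th)).ravel()
z_3d = (np.sin(ph) * np.ones(th.shape)).ravel()

scale = np.array([[1, 2]]).T                            # shape (2, 1)
xs = x_3d * scale                                       # shape (2, 900)
ys = y_3d * scale

# Row 0 is the unit sphere, row 1 is stretched by 2 in x and y
on_sphere  = np.allclose(xs[0]**2 + ys[0]**2 + z_3d**2, 1)
on_ellipse = np.allclose((xs[1]/2)**2 + (ys[1]/2)**2 + z_3d**2, 1)
print(xs.shape, on_sphere, on_ellipse)
```

The z values are shared by both curves, which is why z_3d is passed without scaling and is broadcast against the two rows of x and y.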
Image arrays can be plotted as a heat map:
x,y = np.ogrid[-10:11,-10:11]
gp.plot( x**2 + y**2,
title = 'Heat map',
set = 'view map',
_with = 'image',
tuplesize = 3)
Data plotted on top of an existing image. Useful for image annotations.
gp.plot( x, y,
title = 'Points on top of an image',
_with = 'points',
square = 1,
rgbimage = 'image.png')
Hardcopies
To send any plot to a file, instead of to the screen, one can simply do
plot(x, y,
hardcopy = 'output.pdf')
For common output formats, the gnuplot terminal is inferred from the filename. If this isn’t possible or if we want to tightly control the output, the ‘terminal’ plot option can be given explicitly. For example, to generate a PDF of a particular size with a particular font size for the text, one can do
plot(x, y,
terminal = 'pdfcairo solid color font ",10" size 11in,8.5in',
hardcopy = 'output.pdf')
This command is equivalent to the ‘hardcopy’ shorthand used previously, but the fonts and sizes have been changed.
If we write to a “.gp” file:
plot(x, y,
hardcopy = 'data.gp')
then instead of running gnuplot, we create a self-plotting file. gnuplot is invoked when we execute that file.
GLOBAL FUNCTIONS
plot()
A simple wrapper around class gnuplotlib
SYNOPSIS
>>> import numpy as np
>>> import gnuplotlib as gp
>>> x = np.linspace(-5,5,100)
>>> gp.plot( x, np.sin(x) )
[ graphical plot pops up showing a simple sinusoid ]
>>> gp.plot( (x, np.sin(x), {'with': 'boxes'}),
...          (x, np.cos(x), {'legend': 'cosine'}),
...          _with    = 'lines',
...          terminal = 'dumb 80,40',
...          unset    = 'grid')
[ ascii plot printed on STDOUT: sin(x) drawn with boxes and the curve labeled "cosine" drawn with lines; x axis from -6 to 6, y axis from -1 to 1 ]
DESCRIPTION
class gnuplotlib provides full power and flexibility, but for simple plots this wrapper is easier to use. plot() uses a global instance of class gnuplotlib, so only a single plot can be made by plot() at a time: the one plot window is reused.
Data is passed to plot() in exactly the same way as when using class gnuplotlib. The kwargs passed to this function are a combination of curve options and plot options. The curve options passed here are defaults for all the curves. Any specific options specified in each curve override the defaults given in the kwargs.
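The override rule can be pictured as a plain dict merge. The sketch below is a toy model of the semantics described above, not gnuplotlib's actual implementation; effective_options is a hypothetical name. The option values mirror the SYNOPSIS, where the sine curve asks for boxes and the cosine curve falls back to the default.

```python
def effective_options(defaults, curve_options):
    """Toy model: per-curve options win over the kwargs defaults."""
    merged = dict(defaults)        # start from the defaults given as kwargs
    merged.update(curve_options)   # anything set on the curve overrides them
    return merged

# Default curve option taken from the kwargs (_with = 'lines')
defaults = {'with': 'lines'}

sin_curve = effective_options(defaults, {'with': 'boxes'})
cos_curve = effective_options(defaults, {'legend': 'cosine'})

print(sin_curve)
print(cos_curve)
```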
See the documentation for class gnuplotlib for full details.
plot3d()
A simple wrapper around class gnuplotlib to make 3d plots
SYNOPSIS
import numpy as np
import gnuplotlib as gp
th = np.linspace(0,10,1000)
x = np.cos(th)
y = np.sin(th)
gp.plot3d( x, y, th )
[ an interactive, graphical plot of a spiral pops up]
DESCRIPTION
class gnuplotlib provides full power and flexibility, but for simple 3d plots this wrapper is easier to use. plot3d() simply calls plot(…, _3d=True). See the documentation for plot() and class gnuplotlib for full details.
plotimage()
A simple wrapper around class gnuplotlib to plot image maps
SYNOPSIS
import numpy as np
import gnuplotlib as gp
x,y = np.ogrid[-10:11,-10:11]
gp.plotimage( x**2 + y**2,
title = 'Heat map')
DESCRIPTION
class gnuplotlib provides full power and flexibility, but for simple image-map plots this wrapper is easier to use. plotimage() simply calls plot(…, _with='image', tuplesize=3). See the documentation for plot() and class gnuplotlib for full details.
wait()
Waits until the open interactive plot window is closed
SYNOPSIS
import numpy as np
import gnuplotlib as gp
gp.plot(np.arange(5))
# interactive plot pops up
gp.wait()
# We get here when the user closes the plot window
DESCRIPTION
This applies to the global gnuplotlib object.
It’s not at all trivial to detect whether a plot window currently exists. If no window is open, this function will end up waiting forever, and the user will need to press Ctrl-C.
add_plot_option()
Ingests new key/value pairs into an option dict
SYNOPSIS
# A baseline plot_options dict was given to us. We want to make the
# plot, but make sure to omit the legend key
gp.add_plot_option(plot_options, 'unset', 'key')
gp.plot(..., **plot_options)
DESCRIPTION
Given a plot_options dict we can easily add a new option with
plot_options[key] = value
This has several potential problems:
- If an option for this key already exists, the above will overwrite the old value instead of adding a NEW option
- All options may take a leading _ to avoid conflicting with Python reserved words (set, _set for instance). The above may unwittingly create a duplicate
- Some plot options support multiple values, which the simple call ignores completely
THIS function takes care of the _ in keys. And this function knows which keys support multiple values. If a duplicate is given, it will either raise an exception, or append to the existing list, as appropriate.
If the given key supports multiple values, they can be given in a single call, as a list or a tuple.
Multiple key/values can be given using keyword arguments.
ARGUMENTS
- d: the plot options dict we’re updating
- key: string. The key being set
- values: string (if setting a single value) or iterable (if setting multiple values)
- **kwargs: more key/value pairs to set. We set the key/value positional arguments first, and then move on to the kwargs
- overwrite: optional boolean that controls how we handle keys that are already set and do not accept multiple values. By default (overwrite is None), trying to set such a key results in an exception. If overwrite is true, we replace the previous value. If overwrite is false (but not None), we keep the previous value and ignore the new one.
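The rules described in this section can be summarized with a small toy model. This is NOT gnuplotlib's implementation: toy_add_option is a hypothetical stand-in for add_plot_option(), the set of multi-value keys is an assumption (the real list may differ), the exception type is a guess, and the **kwargs form is left out for brevity.

```python
# Assumption for this sketch: these keys accept multiple values
MULTI_VALUE_KEYS = {'set', 'unset', 'cmds'}

def toy_add_option(d, key, values, overwrite=None):
    """Toy model of the documented behavior, for illustration only."""
    # 'set' and '_set' name the same option
    bare = key[1:] if key.startswith('_') else key
    existing = None
    for spelling in (bare, '_' + bare):
        if spelling in d:
            existing = spelling

    if bare in MULTI_VALUE_KEYS:
        # Multi-value keys: append to whatever is already there
        new = list(values) if isinstance(values, (list, tuple)) else [values]
        if existing is None:
            d[bare] = new
        else:
            old = d[existing]
            old = list(old) if isinstance(old, (list, tuple)) else [old]
            d[existing] = old + new
        return

    if existing is None:
        d[bare] = values
    elif overwrite is None:
        raise ValueError("option '%s' is already set" % bare)
    elif overwrite:
        d[existing] = values
    # overwrite is false but not None: keep the previous value

opts = {'_set': 'grid', 'title': 'old title'}
toy_add_option(opts, 'set', 'polar')                    # appends
toy_add_option(opts, 'title', 'new', overwrite=True)    # replaces
toy_add_option(opts, 'title', 'ignored', overwrite=False)
print(opts)
```

With the real module, the equivalent call for the first case is gp.add_plot_option(plot_options, 'unset', 'key'), as shown in the SYNOPSIS above.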
COMPATIBILITY
Python 2 and Python 3 should both be supported. Please report a bug if either one doesn’t work.
REPOSITORY
https://github.com/dkogan/gnuplotlib
AUTHOR
Dima Kogan
LICENSE AND COPYRIGHT
Copyright 2015-2020 Dima Kogan.
This program is free software; you can redistribute it and/or modify it under the terms of the GNU Lesser General Public License (any version) as published by the Free Software Foundation.