Histogramming for analysis powered by boost-histogram

Hist

Hist is an analyst-friendly front-end for boost-histogram, designed for Python 3.7+ (3.6 users get version 2.4). See what's new.

Slideshow of features. See docs/banner_slides.md for text if the image is not readable.

Installation

You can install this library from PyPI with pip:

python3 -m pip install "hist[plot]"

If you do not need the plotting features, you can skip the [plot] extra.
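To confirm what ended up in your environment, a small standard-library check can help. This is only a sketch: the helper name `is_installed` is hypothetical, and using `matplotlib` as the marker for the `[plot]` extra is an assumption rather than something stated above.

```python
import importlib.util


def is_installed(module_name):
    """Return True if the named top-level module can be found."""
    return importlib.util.find_spec(module_name) is not None


# "matplotlib" is assumed here to be one of the plotting dependencies.
for name in ("hist", "matplotlib"):
    status = "found" if is_installed(name) else "missing"
    print(f"{name}: {status}")
```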

Features

Hist currently provides everything boost-histogram provides, and the following enhancements:

  • Hist augments axes with names:

    • name= is a unique label describing each axis.
    • label= is an optional string that is used in plotting (defaults to name if not provided).
    • Indexing, projection, and more support named axes.
    • Experimental NamedHist is a Hist that disables most forms of positional access, forcing users to use only names.
  • The Hist class augments bh.Histogram with simpler construction:

    • flow=False is a fast way to turn off flow for the axes on construction.
    • Storages can be given by string.
    • storage= can be omitted; strings and storages can be positional.
    • data= can initialize a histogram with existing data.
    • Hist.from_columns can be used to initialize with a DataFrame or dict.
    • You can cast back and forth with boost-histogram (or any other extensions).
  • Hist supports QuickConstruct, a construction system that does not require extra imports:

    • Use Hist.new.<axis>().<axis>().<storage>().
    • Axis names can be full (Regular) or short (Reg).
    • Histogram arguments (like data=) can go in the storage.
  • Extended Histogram features:

    • Direct support for .name and .label, like axes.
    • .density() computes the density as an array.
    • .profile(remove_ax) can convert an ND COUNT histogram into an (N-1)D MEAN histogram.
    • .sort(axis) supports sorting a histogram by a categorical axis. Optionally takes a function to sort by.
  • Hist implements UHI+, an extension to the UHI (Unified Histogram Indexing) system designed for import-free interactivity:

    • Uses j suffix to switch to data coordinates in access or slices.
    • Uses j suffix on slices to rebin.
    • Strings can be used directly to index into string category axes.
  • Quick plotting routines encourage exploration:

    • .plot() provides 1D and 2D plots (or use plot1d(), plot2d()).
    • .plot2d_full() shows 1D projections around a 2D plot.
    • .plot_ratio(...) makes a ratio plot between the histogram and another histogram or callable.
    • .plot_pull(...) makes a pull plot.
    • .plot_pie() makes a pie plot.
    • .show() provides a nice str printout using Histoprint.
  • Stacks: work with groups of histograms with identical axes

    • Stacks can be created with h.stack(axis), using the index or name of an axis (StrCategory axes are ideal).
    • You can also create one with hist.stacks.Stack(h1, h2, ...), or use from_iter or from_dict.
    • You can index a stack, and set an entry with a matching histogram.
    • Stacks support .plot() and .show(), with names (plot labels default to original axes info).
    • Stacks pass through .project, *, +, and -.
  • New modules

    • intervals supports frequentist coverage intervals.
  • Notebook ready: Hist has a gorgeous in-notebook representation.

    • No dependencies required
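
As a rough illustration of the `.density()` feature listed above, here is a pure-Python sketch of a density normalization for a 1D histogram. It assumes the conventional definition (each count divided by the total count times the bin width) and is not Hist's own implementation; the function name `density_1d` and the sample data are hypothetical.

```python
def density_1d(counts, edges):
    """Normalize counts so that the sum of (density * bin width) is 1."""
    total = sum(counts)
    widths = [hi - lo for lo, hi in zip(edges[:-1], edges[1:])]
    return [count / (total * width) for count, width in zip(counts, widths)]


# Hypothetical variable-width histogram with three bins.
counts = [1, 2, 1]
edges = [0.0, 1.0, 3.0, 4.0]

dens = density_1d(counts, edges)
widths = [hi - lo for lo, hi in zip(edges[:-1], edges[1:])]
area = sum(d * w for d, w in zip(dens, widths))

print(dens)  # the wide middle bin is scaled down by its width
print(area)  # integrates to 1
```

Dividing by the bin width is what makes bins of different sizes comparable, which matters for variable-width axes.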

Usage

from hist import Hist

# Quick construction, no other imports needed:
h = (
    Hist.new.Reg(10, 0, 1, name="x", label="x-axis")
    .Var(range(10), name="y", label="y-axis")
    .Int64()
)

# Filling by names is allowed:
h.fill(y=[1, 4, 6], x=[3, 5, 2])

# Names can be used to manipulate the histogram:
h.project("x")
h[{"y": 0.5j + 3, "x": 5j}]

# You can access data coordinates or rebin with a `j` suffix:
h[0.3j:, ::2j]  # x from .3 to the end, y is rebinned by 2

# Elegant plotting functions:
h.plot()
h.plot2d_full()
h.plot_pull(Callable)  # Callable is a placeholder for the function to compare against
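
The `j` suffix works because Python parses `0.3j` as an ordinary complex literal, so no import is needed. The following pure-Python sketch shows how such indices could be interpreted on a regular axis. It is an illustration under stated assumptions, not Hist's implementation: the helper names `locate`, `resolve_index`, and `rebin` are hypothetical, and flow bins are ignored.

```python
import math


def locate(value, n_bins, start, stop):
    """Return the bin number on a regular axis that contains `value`."""
    width = (stop - start) / n_bins
    return math.floor((value - start) / width)


def resolve_index(index, n_bins, start, stop):
    """Turn a plain or `j`-suffixed index into a bin number.

    A plain int is already a bin number. A complex number carries a data
    coordinate in its imaginary part and an optional bin offset in its
    real part, so `0.5j + 3` means "the bin containing 0.5, plus 3 bins".
    """
    if isinstance(index, complex):
        return locate(index.imag, n_bins, start, stop) + int(index.real)
    return index


def rebin(counts, step):
    """Merge neighbouring bins when the slice step is imaginary, e.g. `2j`."""
    factor = int(step.imag)
    return [sum(counts[i:i + factor]) for i in range(0, len(counts), factor)]


# Hypothetical axis: 10 bins covering 0 to 10.
print(resolve_index(5, 10, 0, 10))         # plain bin number: 5
print(resolve_index(3.5j, 10, 0, 10))      # bin containing x = 3.5: 3
print(resolve_index(0.5j + 3, 10, 0, 10))  # bin containing 0.5, shifted by 3: 3
print(rebin([1, 2, 3, 4, 5, 6], 2j))       # pairs of bins summed: [3, 7, 11]
```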

Development

From a git checkout, either use nox, or run:

python -m pip install -e ".[dev]"

See Contributing guidelines for information on setting up a development environment.

Contributors

We would like to acknowledge the contributors that made this project possible (emoji key):


  • Henry Schreiner: 🚧 💻 📖
  • Nino Lau: 🚧 💻 📖
  • Chris Burr: 💻
  • Nick Amin: 💻
  • Eduardo Rodrigues: 💻
  • Andrzej Novak: 💻
  • Matthew Feickert: 💻
  • Kyle Cranmer: 📖
  • Daniel Antrim: 💻
  • Nicholas Smith: 💻
  • Michael Eliachevitch: 💻
  • Jonas Eschle: 📖

This project follows the all-contributors specification.

Acknowledgements

This library was primarily developed by Henry Schreiner and Nino Lau.

Support for this work was provided by the National Science Foundation cooperative agreement OAC-1836650 (IRIS-HEP) and OAC-1450377 (DIANA/HEP). Any opinions, findings, conclusions or recommendations expressed in this material are those of the authors and do not necessarily reflect the views of the National Science Foundation.
