  • Stars: 251
  • Rank: 161,862 (Top 4 %)
  • Language: Python
  • License: Apache License 2.0
  • Created almost 7 years ago
  • Updated over 1 year ago

Repository Details

pure-Python HistFactory implementation with tensors and autodiff

Pure-Python fitting, limit setting, and interval estimation, HistFactory-style

The HistFactory p.d.f. template [CERN-OPEN-2012-016] is, per se, independent of its implementation in ROOT, and it is sometimes useful to be able to run statistical analysis outside of the ROOT, RooFit, RooStats framework.

This repo is a pure-Python implementation of that statistical model for multi-bin histogram-based analysis. Its interval estimation is based on the asymptotic formulas of “Asymptotic formulae for likelihood-based tests of new physics” [arXiv:1007.1727]. The aim is also to support modern computational graph libraries such as PyTorch and TensorFlow in order to make use of features such as automatic differentiation and GPU acceleration.
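
To make the idea of a histogram-based likelihood concrete, here is a minimal standard-library sketch of a one-channel signal plus background model. It is not pyhf's implementation. It assumes that each bin contributes a Poisson term for the observed count and that the uncorrelated background uncertainty is constrained by an auxiliary Poisson term with tau = (b / sigma)^2. The function names `log_poisson` and `log_likelihood` are hypothetical; the numbers are those of the Hello World example in this README.

```python
import math


def log_poisson(n, lam):
    """Log of the Poisson density, continued to non-integer n via lgamma."""
    return n * math.log(lam) - lam - math.lgamma(n + 1.0)


def log_likelihood(mu, gammas, observed, signal, bkg, bkg_uncertainty):
    """Log-likelihood of a one-channel signal + background model (illustrative).

    mu     : signal strength (the parameter of interest)
    gammas : one background scale factor per bin (nuisance parameters)
    """
    total = 0.0
    for n, s, b, sigma, gamma in zip(observed, signal, bkg, bkg_uncertainty, gammas):
        # Main measurement: observed count versus expected rate in this bin.
        total += log_poisson(n, mu * s + gamma * b)
        # Constraint term (assumed form): the relative uncertainty sigma / b is
        # encoded as an auxiliary Poisson measurement with tau = (b / sigma) ** 2.
        tau = (b / sigma) ** 2
        total += log_poisson(tau, gamma * tau)
    return total


signal, bkg, bkg_uncertainty = [12.0, 11.0], [50.0, 52.0], [3.0, 7.0]
observed = [51.0, 48.0]

ll_bkg_only = log_likelihood(0.0, [1.0, 1.0], observed, signal, bkg, bkg_uncertainty)
ll_signal = log_likelihood(1.0, [1.0, 1.0], observed, signal, bkg, bkg_uncertainty)
print(f"log L(mu=0) = {ll_bkg_only:.3f}, log L(mu=1) = {ll_signal:.3f}")
```

With the nuisance parameters held at their nominal values, the background-only hypothesis has the larger likelihood for this data. A real analysis would also fit the gammas for each tested mu, which is the part pyhf delegates to its optimizers.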

User Guide

For an in-depth walkthrough of how to use the latest release of pyhf, visit the pyhf tutorial.

Hello World

This is how you use the pyhf Python API to build a statistical model and run basic inference:

>>> import pyhf
>>> pyhf.set_backend("numpy")
>>> model = pyhf.simplemodels.uncorrelated_background(
...     signal=[12.0, 11.0], bkg=[50.0, 52.0], bkg_uncertainty=[3.0, 7.0]
... )
>>> data = [51, 48] + model.config.auxdata
>>> test_mu = 1.0
>>> CLs_obs, CLs_exp = pyhf.infer.hypotest(
...     test_mu, data, model, test_stat="qtilde", return_expected=True
... )
>>> print(f"Observed: {CLs_obs:.8f}, Expected: {CLs_exp:.8f}")
Observed: 0.05251497, Expected: 0.06445321

Alternatively, the statistical model and observational data can be read from their serialized JSON representation (see next section).

>>> import pyhf
>>> import requests
>>> pyhf.set_backend("numpy")
>>> url = "https://raw.githubusercontent.com/scikit-hep/pyhf/main/docs/examples/json/2-bin_1-channel.json"
>>> wspace = pyhf.Workspace(requests.get(url).json())
>>> model = wspace.model()
>>> data = wspace.data(model)
>>> test_mu = 1.0
>>> CLs_obs, CLs_exp = pyhf.infer.hypotest(
...     test_mu, data, model, test_stat="qtilde", return_expected=True
... )
>>> print(f"Observed: {CLs_obs:.8f}, Expected: {CLs_exp:.8f}")
Observed: 0.35998409, Expected: 0.35998409
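
Because the serialized workspace is plain JSON, it can be inspected with the standard library before it is handed to pyhf. The sketch below uses the workspace layout from the command line example in this README (not the file behind the URL above) and checks that every sample has as many bins as the observation of its channel. The helper `check_bin_counts` is hypothetical and is no substitute for pyhf's own validation of a workspace.

```python
import json

workspace_text = """
{
  "channels": [
    {"name": "singlechannel",
     "samples": [
       {"name": "signal", "data": [12.0, 11.0],
        "modifiers": [{"name": "mu", "type": "normfactor", "data": null}]},
       {"name": "background", "data": [50.0, 52.0],
        "modifiers": [{"name": "uncorr_bkguncrt", "type": "shapesys", "data": [3.0, 7.0]}]}
     ]}
  ],
  "observations": [{"name": "singlechannel", "data": [51.0, 48.0]}],
  "measurements": [{"name": "Measurement", "config": {"poi": "mu", "parameters": []}}],
  "version": "1.0.0"
}
"""


def check_bin_counts(spec):
    """Return {channel name: number of bins}, raising if any bin counts disagree."""
    observed = {obs["name"]: obs["data"] for obs in spec["observations"]}
    n_bins = {}
    for channel in spec["channels"]:
        expected_len = len(observed[channel["name"]])
        for sample in channel["samples"]:
            if len(sample["data"]) != expected_len:
                raise ValueError(f"{channel['name']}/{sample['name']}: bin count mismatch")
        n_bins[channel["name"]] = expected_len
    return n_bins


spec = json.loads(workspace_text)
bins_per_channel = check_bin_counts(spec)
pois = [m["config"]["poi"] for m in spec["measurements"]]
print(bins_per_channel, pois)
```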

Finally, you can also use the command line interface that pyhf provides:

$ cat << EOF  | tee likelihood.json | pyhf cls
{
    "channels": [
        { "name": "singlechannel",
          "samples": [
            { "name": "signal",
              "data": [12.0, 11.0],
              "modifiers": [ { "name": "mu", "type": "normfactor", "data": null} ]
            },
            { "name": "background",
              "data": [50.0, 52.0],
              "modifiers": [ {"name": "uncorr_bkguncrt", "type": "shapesys", "data": [3.0, 7.0]} ]
            }
          ]
        }
    ],
    "observations": [
        { "name": "singlechannel", "data": [51.0, 48.0] }
    ],
    "measurements": [
        { "name": "Measurement", "config": {"poi": "mu", "parameters": []} }
    ],
    "version": "1.0.0"
}
EOF

which should produce the following JSON output:

{
   "CLs_exp": [
      0.0026062609501074576,
      0.01382005356161206,
      0.06445320535890459,
      0.23525643861460702,
      0.573036205919389
   ],
   "CLs_obs": 0.05251497423736956
}
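
The output above can be read back with the standard library to decide whether the tested signal strength is excluded. This sketch assumes the five `CLs_exp` entries are ordered from the -2 sigma to the +2 sigma expected band, with the median in the middle; the middle value does match the expected CLs printed in the Python example above. The label list and the 0.05 threshold (95% confidence level) are choices made here for illustration.

```python
import json

cli_output = """
{
   "CLs_exp": [
      0.0026062609501074576,
      0.01382005356161206,
      0.06445320535890459,
      0.23525643861460702,
      0.573036205919389
   ],
   "CLs_obs": 0.05251497423736956
}
"""

result = json.loads(cli_output)

# Assumed ordering of the expected band: -2, -1, 0, +1, +2 sigma.
band_labels = ["-2 sigma", "-1 sigma", "median", "+1 sigma", "+2 sigma"]
expected = dict(zip(band_labels, result["CLs_exp"]))

alpha = 0.05  # corresponds to a 95% confidence level
excluded = result["CLs_obs"] < alpha

for label, value in expected.items():
    print(f"expected CLs ({label}): {value:.4f}")
print(f"observed CLs: {result['CLs_obs']:.4f}")
print(f"tested signal strength excluded at 95% CL: {excluded}")
```

Here the observed CLs is slightly above 0.05, so the tested hypothesis is not excluded at the 95% confidence level.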

What does it support

Implemented variations:
  • HistoSys
  • OverallSys
  • ShapeSys
  • NormFactor
  • Multiple Channels
  • Import from XML + ROOT via uproot
  • ShapeFactor
  • StatError
  • Lumi Uncertainty
  • Non-asymptotic calculators
Computational Backends:
  • NumPy
  • PyTorch
  • TensorFlow
  • JAX
Optimizers:
  • SciPy (scipy.optimize)
  • MINUIT (iminuit)

All backends can be used in combination with all optimizers. Custom user backends and optimizers can be used as well.
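
As a rough illustration of what the listed variations do to a sample's expected rates, the sketch below applies each kind of modifier to a nominal histogram. It is a simplification and not pyhf's code: the function names are hypothetical, the constraint terms that distinguish ShapeSys, StatError, and ShapeFactor are left out, and the simplest piecewise interpolation forms are assumed, which may differ from the interpolation pyhf actually uses.

```python
def normfactor(rates, mu):
    """NormFactor: scale every bin by one unconstrained parameter."""
    return [mu * r for r in rates]


def shape_scale(rates, gammas):
    """ShapeSys / ShapeFactor / StatError style: one multiplicative parameter per bin."""
    return [g * r for g, r in zip(gammas, rates)]


def overallsys(rates, alpha, lo, hi):
    """OverallSys: one normalisation factor interpolated between lo (alpha=-1) and hi (alpha=+1).

    Piecewise-exponential interpolation, assumed here for simplicity.
    """
    factor = hi ** alpha if alpha >= 0 else lo ** (-alpha)
    return [factor * r for r in rates]


def histosys(rates, alpha, lo_rates, hi_rates):
    """HistoSys: per-bin additive shift interpolated between lo and hi templates.

    Piecewise-linear interpolation, assumed here for simplicity.
    """
    shifted = []
    for nominal, lo, hi in zip(rates, lo_rates, hi_rates):
        delta = (hi - nominal) if alpha >= 0 else (nominal - lo)
        shifted.append(nominal + alpha * delta)
    return shifted


nominal = [50.0, 52.0]
print(normfactor([12.0, 11.0], mu=2.0))
print(shape_scale(nominal, gammas=[1.0, 0.5]))
print(overallsys(nominal, alpha=1.0, lo=0.9, hi=1.1))
print(histosys(nominal, alpha=-1.0, lo_rates=[45.0, 50.0], hi_rates=[55.0, 54.0]))
```

At alpha = 0 both interpolating modifiers return the nominal rates, and at alpha = -1 or +1 they reproduce the "lo" or "hi" variation.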

Todo

  • ☐ StatConfig

Results obtained from this package are validated against output computed from HistFactory workspaces.

A one bin example

import pyhf
import numpy as np
import matplotlib.pyplot as plt
from pyhf.contrib.viz import brazil

pyhf.set_backend("numpy")
model = pyhf.simplemodels.uncorrelated_background(
    signal=[10.0], bkg=[50.0], bkg_uncertainty=[7.0]
)
data = [55.0] + model.config.auxdata

poi_vals = np.linspace(0, 5, 41)
results = [
    pyhf.infer.hypotest(
        test_poi, data, model, test_stat="qtilde", return_expected_set=True
    )
    for test_poi in poi_vals
]

fig, ax = plt.subplots()
fig.set_size_inches(7, 5)
brazil.plot_results(poi_vals, results, ax=ax)
fig.show()

[Figures: result of the one bin example computed with pyhf and with ROOT.]
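
A scan like the one above is often summarised by an upper limit: the smallest POI value at which CLs drops below 0.05. The sketch below finds that crossing by linear interpolation using only the standard library. The function `upper_limit` is hypothetical, and the input curve is a toy stand-in (a falling exponential), not output from pyhf.

```python
import math


def upper_limit(poi_vals, cls_vals, level=0.05):
    """Return the first POI value where CLs crosses below `level`, by linear interpolation.

    Returns None if the scan never crosses the level.
    """
    points = list(zip(poi_vals, cls_vals))
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        if y0 >= level > y1:
            return x0 + (x1 - x0) * (y0 - level) / (y0 - y1)
    return None


# Toy stand-in for a scan: NOT pyhf output, just a smooth falling curve.
poi_vals = [5.0 * i / 40 for i in range(41)]  # same grid as np.linspace(0, 5, 41)
toy_cls = [math.exp(-3.0 * mu) for mu in poi_vals]

limit = upper_limit(poi_vals, toy_cls)
print(f"toy 95% CL upper limit: {limit:.3f}")
```

To apply this to the scan above, and assuming each entry of `results` is a pair of observed CLs and expected band (as `return_expected_set=True` suggests), the observed curve would be `[float(r[0]) for r in results]`.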

A two bin example

import pyhf
import numpy as np
import matplotlib.pyplot as plt
from pyhf.contrib.viz import brazil

pyhf.set_backend("numpy")
model = pyhf.simplemodels.uncorrelated_background(
    signal=[30.0, 45.0], bkg=[100.0, 150.0], bkg_uncertainty=[15.0, 20.0]
)
data = [100.0, 145.0] + model.config.auxdata

poi_vals = np.linspace(0, 5, 41)
results = [
    pyhf.infer.hypotest(
        test_poi, data, model, test_stat="qtilde", return_expected_set=True
    )
    for test_poi in poi_vals
]

fig, ax = plt.subplots()
fig.set_size_inches(7, 5)
brazil.plot_results(poi_vals, results, ax=ax)
fig.show()

[Figures: result of the two bin example computed with pyhf and with ROOT.]

Installation

To install pyhf from PyPI with the NumPy backend run

python -m pip install pyhf

and to install pyhf with all additional backends run

python -m pip install pyhf[backends]

or a subset of the options.

To uninstall run

python -m pip uninstall pyhf

Documentation

For model specification, API reference, examples, and answers to FAQs visit the pyhf documentation.

Questions

If you have a question about the use of pyhf that is not covered in the documentation, please ask it on GitHub Discussions.

If you believe you have found a bug in pyhf, please report it on GitHub Issues. If you're interested in getting updates from the pyhf dev team and release announcements, you can join the pyhf-announcements mailing list.

Citation

As noted in Use and Citations, the preferred BibTeX entry for citation of pyhf includes both the Zenodo archive and the JOSS paper:

@software{pyhf,
  author = {Lukas Heinrich and Matthew Feickert and Giordon Stark},
  title = "{pyhf: v0.7.2}",
  version = {0.7.2},
  doi = {10.5281/zenodo.1169739},
  url = {https://doi.org/10.5281/zenodo.1169739},
  note = {https://github.com/scikit-hep/pyhf/releases/tag/v0.7.2}
}

@article{pyhf_joss,
  doi = {10.21105/joss.02823},
  url = {https://doi.org/10.21105/joss.02823},
  year = {2021},
  publisher = {The Open Journal},
  volume = {6},
  number = {58},
  pages = {2823},
  author = {Lukas Heinrich and Matthew Feickert and Giordon Stark and Kyle Cranmer},
  title = {pyhf: pure-Python implementation of HistFactory statistical models},
  journal = {Journal of Open Source Software}
}

Authors

pyhf is openly developed by Lukas Heinrich, Matthew Feickert, and Giordon Stark.

Please check the contribution statistics for a list of contributors.

Milestones

  • 2022-09-12: 2000 GitHub issues and pull requests. (See PR #2000)
  • 2021-12-09: 1000 commits to the project. (See PR #1710)
  • 2020-07-28: 1000 GitHub issues and pull requests. (See PR #1000)

Acknowledgements

Matthew Feickert has received support to work on pyhf provided by NSF cooperative agreement OAC-1836650 (IRIS-HEP) and grant OAC-1450377 (DIANA/HEP).

pyhf is a NumFOCUS Affiliated Project.
