ABCpy package

ABCpy is a scientific library written in Python for Bayesian uncertainty quantification in absence of likelihood function, which parallelizes existing approximate Bayesian computation (ABC) algorithms and other likelihood-free inference schemes.
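
To make the idea of likelihood-free inference concrete, the following is a minimal sketch of rejection ABC, the simplest ABC scheme, written with the Python standard library only. It does not use ABCpy's API; the function name `rejection_abc`, the uniform prior, the Gaussian toy model and the tolerance `epsilon` are all assumptions chosen for illustration.

```python
import random
import statistics


def rejection_abc(observed, n_samples, epsilon, seed=0):
    """Draw approximate posterior samples of a Gaussian mean by rejection ABC.

    Illustrative toy setup (not ABCpy code): prior mu ~ Uniform(-10, 10),
    model x ~ Normal(mu, 1), summary statistic = sample mean.
    """
    rng = random.Random(seed)
    observed_summary = statistics.fmean(observed)
    accepted = []
    while len(accepted) < n_samples:
        mu = rng.uniform(-10.0, 10.0)  # 1. propose a parameter from the prior
        simulated = [rng.gauss(mu, 1.0) for _ in observed]  # 2. simulate data
        # 3. keep the proposal only if simulated and observed summaries are close
        if abs(statistics.fmean(simulated) - observed_summary) < epsilon:
            accepted.append(mu)
    return accepted


# Synthetic "observed" data generated with a known mean of 3.0.
data_rng = random.Random(42)
observed = [data_rng.gauss(3.0, 1.0) for _ in range(30)]
posterior = rejection_abc(observed, n_samples=100, epsilon=0.2)
print(f"approximate posterior mean: {statistics.fmean(posterior):.2f}")
```

Shrinking `epsilon` tightens the approximation but rejects more simulations, which is why running the simulations in parallel matters in practice.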

Content

ABCpy presently includes a range of ABC algorithms, which can be used with several distances. Moreover, we provide methods for directly approximating the likelihood function, which can be used with several samplers, together with a number of additional features.

ABCpy addresses the needs of domain scientists and data scientists by providing

  • a fully modularized framework that is easy to use and easy to extend,
  • a quick way to integrate your generative model into the framework (from C++, R etc.) and
  • a non-intrusive, user-friendly way to parallelize inference computations (for your laptop to clusters, supercomputers and AWS)
  • an intuitive way to perform inference on hierarchical models or more generally on Bayesian networks
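
As a rough illustration of what non-intrusive parallelization can look like, the sketch below routes every simulation through a small backend object that exposes a `map` method, so that swapping the backend changes how the work is executed without touching the inference code. The class names `SerialBackend` and `ThreadBackend` and the function `simulate` are hypothetical and are not part of ABCpy.

```python
from concurrent.futures import ThreadPoolExecutor
import random


class SerialBackend:
    """Runs every task one after another in the current thread."""

    def map(self, func, items):
        return [func(item) for item in items]


class ThreadBackend:
    """Runs tasks on a thread pool; results keep the input order."""

    def __init__(self, max_workers=4):
        self.max_workers = max_workers

    def map(self, func, items):
        with ThreadPoolExecutor(max_workers=self.max_workers) as pool:
            return list(pool.map(func, items))


def simulate(seed):
    """Hypothetical stand-in for a generative model: mean of 1000 Gaussian draws."""
    rng = random.Random(seed)
    return sum(rng.gauss(0.0, 1.0) for _ in range(1000)) / 1000


def run_simulations(backend, seeds):
    # The inference code only talks to backend.map, never to threads directly.
    return backend.map(simulate, seeds)


seeds = list(range(8))
serial_results = run_simulations(SerialBackend(), seeds)
threaded_results = run_simulations(ThreadBackend(), seeds)
```

Threads are used here only to keep the sketch self-contained; CPU-bound simulations would need a process-based or cluster-based backend to gain actual speed.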

Documentation

For more information, check out the documentation.

Further, we provide a collection of models to which ABCpy has been applied successfully. This is a good place to look at more complicated inference setups.

Quick installation and requirements

ABCpy can be installed from pip:

pip install abcpy

Check here for more details.

Basic requirements are listed in requirements.txt. That file also includes the packages required for MPI parallelization, which is used very often. However, we also provide support for parallelization with Apache Spark (see below).

Additional packages are required for additional features:

  • torch is needed in order to use neural networks to learn summary statistics. It can be installed by running pip install -r requirements/neural_networks_requirements.txt
  • findspark and pyspark are needed in order to use Apache Spark for parallelization. They can be installed by running pip install -r requirements/backend-spark.txt
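
After installation, a short script can report which of these pieces are present in the current environment. The sketch below uses only the standard library; the helper names `installed_version` and `missing_optional_features` are made up for this example, while the module names and requirement files are those listed above.

```python
from importlib import metadata, util

# Optional feature -> (modules it needs, requirements file named in this README)
OPTIONAL_FEATURES = {
    "neural-network summary statistics": (
        ["torch"],
        "requirements/neural_networks_requirements.txt",
    ),
    "Apache Spark backend": (
        ["findspark", "pyspark"],
        "requirements/backend-spark.txt",
    ),
}


def installed_version(distribution):
    """Return the installed version string, or None if it is not installed."""
    try:
        return metadata.version(distribution)
    except metadata.PackageNotFoundError:
        return None


def missing_optional_features(features=OPTIONAL_FEATURES):
    """Map each feature with missing modules to the file that installs them."""
    missing = {}
    for feature, (modules, requirements_file) in features.items():
        if any(util.find_spec(name) is None for name in modules):
            missing[feature] = requirements_file
    return missing


print("abcpy version:", installed_version("abcpy"))
for feature, requirements_file in missing_optional_features().items():
    print(f"{feature}: run pip install -r {requirements_file}")
```

The check only looks the modules up and does not import them, so it stays fast even when heavy packages such as torch are installed.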

Troubleshooting mpi4py installation

mpi4py requires a working MPI implementation to be installed; check the official docs for more info. On Ubuntu, that can be installed with:

sudo apt-get install libopenmpi-dev

Even when that is present, running pip install mpi4py can sometimes lead to errors. In fact, as specified in the official docs, the mpicc compiler needs to be in the search path. If that is not the case, a workaround is:

env MPICC=/path/to/mpicc pip install mpi4py

In some cases, even the above may not be enough. One option is to use conda (conda install mpi4py), which usually handles package dependencies better than pip. Alternatively, you can try installing mpi4py directly from the system package manager; on Ubuntu, you can do:

sudo apt install python3-mpi4py 

which however does not work with virtual environments.
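
Before retrying the installation, it can help to check from Python whether `mpicc` is actually visible. The sketch below looks the compiler up on the search path and builds the environment for the `MPICC` workaround described above; the helper names `find_mpicc` and `mpi4py_install_command` are hypothetical, and nothing is installed unless the final commented-out line is enabled.

```python
import os
import shutil
import sys


def find_mpicc():
    """Return the full path of mpicc if it is on the search path, else None."""
    return shutil.which("mpicc")


def mpi4py_install_command(mpicc_path=None):
    """Build (command, environment) for installing mpi4py with pip.

    If mpicc_path is given, it is passed through the MPICC variable,
    mirroring `env MPICC=/path/to/mpicc pip install mpi4py`.
    """
    env = dict(os.environ)
    if mpicc_path is not None:
        env["MPICC"] = mpicc_path
    command = [sys.executable, "-m", "pip", "install", "mpi4py"]
    return command, env


mpicc = find_mpicc()
if mpicc is None:
    print("mpicc not found on PATH; pass its location explicitly.")
else:
    print("mpicc found at", mpicc)

command, env = mpi4py_install_command(mpicc)
# To actually run the installation, uncomment the next line:
# import subprocess; subprocess.run(command, env=env, check=True)
```

Calling pip through `sys.executable -m pip` installs into the interpreter that runs the script, which avoids mixing up virtual environments.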

Author

ABCpy was written by Ritabrata Dutta, Warwick University, and Marcel Schoengens, CSCS, ETH Zurich, and is presently actively maintained by Lorenzo Pacchiardi, Oxford University, and Ritabrata Dutta, Warwick University. Please feel free to submit any bugs or feature requests. We'd also love to hear about your experiences with ABCpy in general. Drop us an email!

We want to thank Prof. Antonietta Mira, Università della Svizzera italiana, and Prof. Jukka-Pekka Onnela, Harvard University, for helpful contributions and advice; Avinash Ummadisinghu and Nicole Widmern, respectively, for developing the dynamic-MPI backend and making ABCpy suitable for hierarchical models; and finally CSCS (Swiss National Supercomputing Centre) for their generous support.

Citation

There is a paper in the Journal of Statistical Software. If you use ABCpy in a publication, we would appreciate a citation. You can use this BibTeX reference.

Other References

Publications in which ABCpy was applied:

  • L. Pacchiardi, R. Dutta, "Generalized Bayesian Likelihood-Free Inference Using Scoring Rules Estimators", 2021, arXiv:2104.03889.

  • L. Pacchiardi, R. Dutta, "Score Matched Conditional Exponential Families for Likelihood-Free Inference", 2022, Journal of Machine Learning Research, 23(38), 1-71.

  • R. Dutta, K. Zouaoui-Boudjeltia, C. Kotsalos, A. Rousseau, D. Ribeiro de Sousa, J. M. Desmet, A. Van Meerhaeghe, A. Mira, and B. Chopard, "Interpretable pathological test for Cardio-vascular disease: Approximate Bayesian computation with distance learning", 2020, arXiv:2010.06465.

  • R. Dutta, S. Gomes, D. Kalise, L. Pacchiardi, "Using mobility data in the design of optimal lockdown strategies for the COVID-19 pandemic in England", 2021, PLOS Computational Biology, 17(8), e1009236.

  • L. Pacchiardi, P. Künzli, M. Schöngens, B. Chopard, R. Dutta, "Distance-Learning for Approximate Bayesian Computation to Model a Volcanic Eruption", 2021, Sankhya B, 83(1), 288-317.

  • R. Dutta, J. P. Onnela, A. Mira, "Bayesian Inference of Spreading Processes on Networks", 2018, Proceedings of the Royal Society A, 474(2215), 20180129.

  • R. Dutta, Z. Faidon Brotzakis and A. Mira, "Bayesian Calibration of Force-fields from Experimental Data: TIP4P Water", 2018, Journal of Chemical Physics, 149, 154110.

  • R. Dutta, B. Chopard, J. Lätt, F. Dubois, K. Zouaoui Boudjeltia and A. Mira, "Parameter Estimation of Platelets Deposition: Approximate Bayesian Computation with High Performance Computing", 2018, Frontiers in Physiology, 9.

  • A. Ebert, R. Dutta, K. Mengersen, A. Mira, F. Ruggeri and P. Wu, "Likelihood-free parameter estimation for dynamic queueing networks: case study of passenger flow in an international airport terminal", 2021, Journal of the Royal Statistical Society: Series C (Applied Statistics), 70(3), 770-792.

License

ABCpy is published under the BSD 3-clause license, see here.

Contribute

You are very welcome to contribute to ABCpy.

If you want to contribute code, there are a few things to consider:
