  • Stars: 410
  • Rank: 105,468 (Top 3 %)
  • Language: Jupyter Notebook
  • License: Other
  • Created almost 5 years ago
  • Updated 12 months ago

Repository Details

Python library for learning the graphical structure of Bayesian networks, parameter learning, inference and sampling methods.

bnlearn - Library for Bayesian network learning and inference

bnlearn is a Python package for learning the graphical structure of Bayesian networks, parameter learning, inference, and sampling methods. Because probabilistic graphical models can be difficult to use, bnlearn for Python (this package) is built on the pgmpy package and contains the most-wanted pipelines. Navigate to the API documentation for more detailed information.

⭐️ Star this repo if you like it ⭐️

Read the Medium blog for more details.


Documentation pages

The documentation pages describe in detail how bnlearn works and include many examples.

Installation

It is advisable to create a new environment (e.g. with Conda):
    conda create -n env_bnlearn python=3.10
    conda activate env_bnlearn
Install bnlearn from PyPI:
    pip install bnlearn
Install bnlearn from the GitHub source:
    pip install git+https://github.com/erdogant/bnlearn
The following functions are available after installation:
# Import library
import bnlearn as bn

# Structure learning
bn.structure_learning.fit()

# Compute edge strength with the test statistic
bn.independence_test(model, df, test='chi_square', prune=True)

# Parameter learning
bn.parameter_learning.fit()

# Inference
bn.inference.fit()

# Make predictions
bn.predict()

# Based on a DAG, you can sample the number of samples you want.
bn.sampling()

# Load well-known examples to play around with, or load your own .bif file.
bn.import_DAG()

# Load simple dataframe of sprinkler dataset.
bn.import_example()

# Compare 2 graphs
bn.compare_networks()

# Plot graph
bn.plot()

# Convert the directed graph to an undirected graph
bn.to_undirected()

# Convert to one-hot datamatrix
bn.df2onehot()

# Derive the topological ordering of the (entire) graph 
bn.topological_sort()

# See below for the exact working of the functions
The following methods are also included:
  • inference
  • sampling
  • comparing two networks
  • loading bif files
  • conversion of directed to undirected graphs
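
The ideas behind `bn.topological_sort()` and `bn.to_undirected()` can be illustrated without bnlearn. The sketch below is not the package's implementation: it uses only the Python standard library (`graphlib`, Python 3.9+) and assumes the classic sprinkler DAG with the edges Cloudy → Sprinkler, Cloudy → Rain, Sprinkler → Wet_Grass and Rain → Wet_Grass.

```python
from graphlib import TopologicalSorter

# Assumed for illustration: the classic sprinkler DAG as (parent, child) edges.
edges = [
    ("Cloudy", "Sprinkler"),
    ("Cloudy", "Rain"),
    ("Sprinkler", "Wet_Grass"),
    ("Rain", "Wet_Grass"),
]

# TopologicalSorter expects a mapping: node -> set of predecessors (parents).
parents = {}
for parent, child in edges:
    parents.setdefault(parent, set())
    parents.setdefault(child, set()).add(parent)

# Every node appears after all of its parents.
order = list(TopologicalSorter(parents).static_order())

# Dropping the direction: each undirected edge is an unordered pair of nodes.
undirected = {frozenset(edge) for edge in edges}

print(order)
print(len(undirected))
```

A topological ordering is what makes forward sampling possible: each variable can be drawn once the values of its parents are known.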

Method overview

Learning a Bayesian network can be split into the problems below, all of which are implemented in this package for discrete, continuous and mixed data sets:

  • Structure learning: Given the data, estimate a DAG that captures the dependencies between the variables.

    • Several structure learning methods are available:
      • Exhaustivesearch
      • Hillclimbsearch
      • NaiveBayes
      • TreeSearch
        • Chow-Liu
        • Tree-augmented Naive Bayes (TAN)
  • Parameter learning: Given the data and the DAG, estimate the (conditional) probability distributions of the individual variables.

  • Inference: Given the learned model, determine the exact probability values for your queries.

Examples

A structured overview of all examples is now available on the documentation pages.

Structure learning
Parameter learning
Inferences
Sampling
Complete examples
Plotting
Various

Various basic examples

    import bnlearn as bn
    # Example dataframe sprinkler_data.csv can be loaded with: 
    df = bn.import_example()
    # df = pd.read_csv('sprinkler_data.csv')
df looks like this:
         Cloudy  Sprinkler  Rain  Wet_Grass
    0         0          1     0          1
    1         1          1     1          1
    2         1          0     1          1
    3         0          0     1          1
    4         1          0     1          1
    ..      ...        ...   ...        ...
    995       0          0     0          0
    996       1          0     0          0
    997       0          0     1          0
    998       1          1     0          1
    999       1          0     1          1
    model = bn.structure_learning.fit(df)
    # Compute edge strength with the chi_square test statistic
    model = bn.independence_test(model, df)
    G = bn.plot(model)
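
To give a feel for what a chi-square edge test measures, here is a minimal plain-Python sketch of Pearson's chi-square test for two binary columns. It is an illustration only, not the code that `bn.independence_test` runs; the function name `chi_square_2x2` and the toy columns are made up for this example.

```python
import math
from collections import Counter

def chi_square_2x2(xs, ys):
    """Pearson chi-square statistic and p-value for two binary (0/1) columns.

    Assumes both columns contain both values, so no expected count is zero.
    """
    n = len(xs)
    observed = Counter(zip(xs, ys))
    x_totals = Counter(xs)
    y_totals = Counter(ys)
    stat = 0.0
    for x in (0, 1):
        for y in (0, 1):
            # Expected count under independence of the two columns.
            expected = x_totals[x] * y_totals[y] / n
            stat += (observed[(x, y)] - expected) ** 2 / expected
    # Survival function of the chi-square distribution with 1 degree of freedom.
    p_value = math.erfc(math.sqrt(stat / 2))
    return stat, p_value

# Toy data (made up): 'wet' mostly follows 'rain', 'coin' is unrelated to it.
rain = [0] * 50 + [1] * 50
wet = [0] * 45 + [1] * 5 + [0] * 5 + [1] * 45
coin = [0, 1] * 50

stat_dep, p_dep = chi_square_2x2(rain, wet)
stat_ind, p_ind = chi_square_2x2(rain, coin)
print(stat_dep, p_dep)
print(stat_ind, p_ind)
```

A small p-value suggests that the two variables are dependent, which is the kind of evidence used to keep an edge; a large p-value is an argument for pruning it.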

  • Choosing various methodtypes and scoringtypes:
    model_hc_bic  = bn.structure_learning.fit(df, methodtype='hc', scoretype='bic')
    model_hc_k2   = bn.structure_learning.fit(df, methodtype='hc', scoretype='k2')
    model_hc_bdeu = bn.structure_learning.fit(df, methodtype='hc', scoretype='bdeu')
    model_ex_bic  = bn.structure_learning.fit(df, methodtype='ex', scoretype='bic')
    model_ex_k2   = bn.structure_learning.fit(df, methodtype='ex', scoretype='k2')
    model_ex_bdeu = bn.structure_learning.fit(df, methodtype='ex', scoretype='bdeu')
    model_cl      = bn.structure_learning.fit(df, methodtype='cl', root_node='Wet_Grass')
    model_tan     = bn.structure_learning.fit(df, methodtype='tan', root_node='Wet_Grass', class_node='Rain')
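
Score-based methods such as hill climbing compare candidate DAGs by a score. As a rough sketch of the idea, the plain-Python function below computes the textbook BIC score of a discrete DAG (log-likelihood minus a penalty on the number of free parameters). The function `bic_score`, the dictionary-based DAG format and the toy data are assumptions made for this illustration; the scores computed inside bnlearn and pgmpy may differ in detail.

```python
import math
from collections import Counter

def bic_score(rows, dag):
    """Textbook BIC of a discrete DAG.

    rows: list of dicts (one per sample); dag: dict node -> tuple of parents.
    """
    n = len(rows)
    score = 0.0
    for node, parents in dag.items():
        # Counts of (parent configuration, node value) and of parent configurations.
        joint = Counter((tuple(r[p] for p in parents), r[node]) for r in rows)
        totals = Counter(tuple(r[p] for p in parents) for r in rows)
        loglik = sum(c * math.log(c / totals[pa]) for (pa, _), c in joint.items())
        # Free parameters: (states - 1) per parent configuration.
        n_states = len({r[node] for r in rows})
        n_configs = 1
        for p in parents:
            n_configs *= len({r[p] for r in rows})
        score += loglik - 0.5 * math.log(n) * (n_states - 1) * n_configs
    return score

# Toy data (made up): B always copies A.
rows = [{"A": 0, "B": 0}] * 50 + [{"A": 1, "B": 1}] * 50

no_edge = {"A": (), "B": ()}
with_edge = {"A": (), "B": ("A",)}

score_no_edge = bic_score(rows, no_edge)
score_with_edge = bic_score(rows, with_edge)
print(score_with_edge > score_no_edge)
```

On this data the DAG with the edge A → B scores higher, because the gain in log-likelihood outweighs the penalty for the extra parameter.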

Example: Parameter Learning

    import bnlearn as bn
    # Import dataframe
    df = bn.import_example()
    # As an example, we set CPD=False, which returns an "empty" DAG (structure only, no CPDs)
    model = bn.import_DAG('sprinkler', CPD=False)
    # Now we learn the parameters of the DAG using the df
    model_update = bn.parameter_learning.fit(model, df)
    # Make plot
    G = bn.plot(model_update)
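
Conceptually, the simplest form of parameter learning is maximum-likelihood estimation: for a fixed DAG, each conditional probability is a relative frequency in the data. The sketch below shows this in plain Python. It is not what `bn.parameter_learning.fit` necessarily does internally; the helper `fit_cpd` and the toy rows are hypothetical.

```python
from collections import Counter

def fit_cpd(rows, node, parents):
    """Maximum-likelihood estimate of P(node | parents) from a list of dict rows.

    Returns a dict keyed by (parent values tuple, node value).
    """
    joint = Counter((tuple(r[p] for p in parents), r[node]) for r in rows)
    totals = Counter(tuple(r[p] for p in parents) for r in rows)
    return {(pa, value): count / totals[pa] for (pa, value), count in joint.items()}

# Toy data (made up): Rain is treated as the only parent of Wet_Grass.
rows = (
    [{"Rain": 1, "Wet_Grass": 1}] * 9
    + [{"Rain": 1, "Wet_Grass": 0}] * 1
    + [{"Rain": 0, "Wet_Grass": 1}] * 2
    + [{"Rain": 0, "Wet_Grass": 0}] * 8
)

cpd = fit_cpd(rows, "Wet_Grass", ("Rain",))
print(cpd[((1,), 1)])  # estimate of P(Wet_Grass=1 | Rain=1)
print(cpd[((0,), 1)])  # estimate of P(Wet_Grass=1 | Rain=0)
```

Pure frequency estimates become unreliable when a parent configuration is rare in the data, which is one reason Bayesian estimators with a prior are often preferred.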

Example: Inference

    import bnlearn as bn
    model = bn.import_DAG('sprinkler')
    query = bn.inference.fit(model, variables=['Rain'], evidence={'Cloudy':1,'Sprinkler':0, 'Wet_Grass':1})
    print(query)
    print(query.df)
    
    # Let's try another inference
    query = bn.inference.fit(model, variables=['Rain'], evidence={'Cloudy':1})
    print(query)
    print(query.df)
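
To show what such a query computes, the sketch below answers the same two questions by brute-force enumeration over the joint distribution. It does not use bnlearn, and the CPD values are illustrative textbook-style numbers assumed for this example; they are not necessarily the values stored in the model returned by `bn.import_DAG('sprinkler')`.

```python
from itertools import product

# Assumed, illustrative CPD values; each entry is P(variable = 1 | parents).
p_cloudy = 0.5
p_sprinkler = {0: 0.5, 1: 0.1}  # keyed by Cloudy
p_rain = {0: 0.2, 1: 0.8}       # keyed by Cloudy
p_wet = {(0, 0): 0.0, (0, 1): 0.9, (1, 0): 0.9, (1, 1): 0.99}  # keyed by (Sprinkler, Rain)

NAMES = ("Cloudy", "Sprinkler", "Rain", "Wet_Grass")

def bern(p_one, value):
    """Probability of a binary value given P(value = 1)."""
    return p_one if value == 1 else 1.0 - p_one

def joint(c, s, r, w):
    """Joint probability, factorised along the sprinkler DAG."""
    return (bern(p_cloudy, c) * bern(p_sprinkler[c], s)
            * bern(p_rain[c], r) * bern(p_wet[(s, r)], w))

def query(variable, evidence):
    """P(variable | evidence) by summing the joint over all consistent assignments.

    Assumes the evidence has non-zero probability.
    """
    weights = {0: 0.0, 1: 0.0}
    for values in product((0, 1), repeat=len(NAMES)):
        assignment = dict(zip(NAMES, values))
        if all(assignment[k] == v for k, v in evidence.items()):
            weights[assignment[variable]] += joint(*values)
    total = sum(weights.values())
    return {state: weight / total for state, weight in weights.items()}

q1 = query("Rain", {"Cloudy": 1, "Sprinkler": 0, "Wet_Grass": 1})
q2 = query("Rain", {"Cloudy": 1})
print(q1)
print(q2)
```

Enumeration grows exponentially with the number of variables, so it is only practical for very small networks; it is shown here purely to make the meaning of a query concrete.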

Contributors

Setting up and maintaining bnlearn has been possible thanks to its users and contributors.

Citation

Please cite bnlearn in your publications if it is useful for your research. See the right-hand column of the repository page for citation information.

Maintainer

  • Erdogan Taskesen, github: erdogant
  • Contributions are welcome.
  • If you wish to buy me a coffee for this work, it is much appreciated :)
