Deep Markov Models

Overview

This repository contains Theano code implementing Deep Markov Models (DMMs). The code is documented and should be easy to modify for your own applications.

[Figure: Deep Markov Model]

The code uses variational inference during learning to maximize a lower bound on the likelihood of the observed data:

[Figure: Evidence Lower Bound]
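
For reference, a standard form of the evidence lower bound for this model class is written out below. It assumes the inference network factorizes as q(z_1 | x_{1:T}) times the product of q(z_t | z_{t-1}, x_{1:T}), as in the paper cited in the References. The symbols \theta (generative parameters) and \phi (inference-network parameters) are notation introduced here, and the expression may not match the original figure term for term.

```latex
\log p_\theta(x_{1:T}) \;\ge\; \mathcal{L}(x_{1:T}; \theta, \phi)
  = \sum_{t=1}^{T} \mathbb{E}_{q_\phi(z_t \mid x_{1:T})}
      \big[ \log p_\theta(x_t \mid z_t) \big]
  - \mathrm{KL}\big( q_\phi(z_1 \mid x_{1:T}) \,\|\, p_\theta(z_1) \big)
  - \sum_{t=2}^{T} \mathbb{E}_{q_\phi(z_{t-1} \mid x_{1:T})}
      \Big[ \mathrm{KL}\big( q_\phi(z_t \mid z_{t-1}, x_{1:T})
        \,\|\, p_\theta(z_t \mid z_{t-1}) \big) \Big]
```

The first term rewards reconstructing each observation from its latent state; the two KL terms keep the approximate posterior close to the initial-state prior and to the learned transition distribution.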

  • Generative Model
    • The figure depicts a state space model for time-varying data.
    • The latent variables z_1...z_T and the observations x_1...x_T together describe the generative process for the data.
    • The emission function p(x_t|z_t) and the transition function p(z_t|z_{t-1}) are parameterized by deep neural networks.
  • Inference Model
    • The function q(z_1...z_T | x_1...x_T) represents the inference network.
    • This is a parametric approximation to the posterior distribution, used for learning at training time and for inference at test time.
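
To make the generative process concrete, here is a minimal sketch of ancestral sampling from a state space model of this shape. It is not the repository's Theano code: the function names, the single tanh layer standing in for the deep networks, the fixed transition noise, and the randomly initialized weights are all assumptions chosen so that the example runs with only the standard library.

```python
import math
import random


def mlp(inputs, weights, biases):
    """One tanh layer: a toy stand-in for the deep networks in the model."""
    return [
        math.tanh(sum(w * v for w, v in zip(row, inputs)) + b)
        for row, b in zip(weights, biases)
    ]


def sigmoid(a):
    return 1.0 / (1.0 + math.exp(-a))


def sample_sequence(T, dim_z, dim_x, rng):
    """Ancestral sampling: z_1 -> x_1, then z_t | z_{t-1} -> x_t for t > 1."""
    # Hypothetical, untrained parameters drawn at random for illustration.
    w_trans = [[rng.gauss(0, 0.5) for _ in range(dim_z)] for _ in range(dim_z)]
    b_trans = [0.0] * dim_z
    w_emit = [[rng.gauss(0, 0.5) for _ in range(dim_z)] for _ in range(dim_x)]
    b_emit = [0.0] * dim_x
    noise_std = 0.1  # assumed fixed transition noise

    z = [rng.gauss(0, 1) for _ in range(dim_z)]  # z_1 ~ N(0, I)
    zs, xs = [], []
    for t in range(T):
        if t > 0:
            # Transition p(z_t | z_{t-1}): Gaussian with a network-valued mean.
            mean = mlp(z, w_trans, b_trans)
            z = [rng.gauss(m, noise_std) for m in mean]
        # Emission p(x_t | z_t): independent Bernoulli per observed dimension.
        logits = [
            sum(w * v for w, v in zip(row, z)) + b
            for row, b in zip(w_emit, b_emit)
        ]
        x = [1 if rng.random() < sigmoid(a) else 0 for a in logits]
        zs.append(z)
        xs.append(x)
    return zs, xs


rng = random.Random(0)
latents, observations = sample_sequence(T=5, dim_z=2, dim_x=4, rng=rng)
```

Each step draws the next latent state from the transition distribution and then an observation from the emission distribution, which mirrors the factorization described in the bullets above.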

Requirements

This package has the following requirements:

  • Python 2.7
  • Theano
    • Used for automatic differentiation.
  • theanomodels
    • A wrapper around Theano that takes care of bookkeeping, saving/loading models, etc. Clone the GitHub repository and add its location to the PYTHONPATH environment variable so that Python can find it.
  • An NVIDIA GPU with at least 6 GB of memory is recommended.
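
If you would rather not edit PYTHONPATH globally, a per-process alternative is to extend sys.path before importing the wrapper. This is only a sketch: the clone location below is a hypothetical path, so replace it with wherever you cloned theanomodels.

```python
import os
import sys

# Hypothetical clone location; replace with your own checkout directory.
THEANOMODELS_DIR = os.path.expanduser("~/src/theanomodels")

# Prepend the directory so that imports in this process can find the package.
if THEANOMODELS_DIR not in sys.path:
    sys.path.insert(0, THEANOMODELS_DIR)
```

Setting PYTHONPATH in your shell profile has the same effect for every Python process, which is usually more convenient for the experiment scripts.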

Optional

I used the following ~/.theanorc configuration file:

[global]
floatX=float32
mode=FAST_RUN

[nvcc]
fastmath=True

[cuda]
root=/usr/local/cuda

You can choose whether the model runs on the GPU or the CPU by modifying the THEANO_FLAGS environment variable. See the Theano configuration documentation for details.
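
As a small illustration, THEANO_FLAGS can also be set from Python, provided this happens before Theano is imported, because Theano reads the variable at import time. The helper name set_theano_flags is hypothetical, and the right GPU device string depends on your Theano version and backend (for example gpu on the older backend or cuda on the newer one), so the sketch defaults to cpu.

```python
import os


def set_theano_flags(device="cpu", floatX="float32"):
    """Build a THEANO_FLAGS string and export it for this process."""
    flags = "device={},floatX={}".format(device, floatX)
    os.environ["THEANO_FLAGS"] = flags
    return flags


flags = set_theano_flags(device="cpu")
# import theano  # must come after the environment variable is set
```

Flags given in THEANO_FLAGS take precedence over the values in ~/.theanorc, so this is a convenient way to switch devices for a single run without editing the configuration file.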

Folders

  • Model Code: model_th: This folder contains raw Theano code implementing the model. See the folder for details on how the DMM was implemented and for pointers to portions of the code.
  • Datasets: dmm_data: This folder contains code to load the polyphonic music data and a synthetic dataset. Add or change code in load.py (dmm_data/load.py) to run the model on your own data.
  • Tutorials: ipynb: This folder contains IPython notebooks with examples of loading and running the model on your own data.
  • Hyperparameters: parse_args.py: This file contains the hyperparameters used by the model. Run python parse_args.py -h for an explanation of how each parameter changes the generative model and the inference network.
  • Modeling Polyphonic Music: expt: Experimental setup for running the DMM on the polyphonic music dataset.
  • Template Folder for Training DMMs: expt_template: Experimental setup for running the DMM on synthetic real-valued observations.

Running the model on your data

  • A general purpose tutorial for setting up and running the model can be found in the IPython Notebooks.
  • The code currently supports binary and real-valued data. An example of modeling binary data may be found in expt/.
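
The two supported data types typically differ only in the emission log-likelihood used in the objective. The sketch below shows a common choice for each: Bernoulli for binary data and a diagonal Gaussian for real-valued data. The function names and the diagonal-covariance assumption are illustrative and are not taken from the repository's code.

```python
import math


def bernoulli_log_likelihood(x, probs, eps=1e-7):
    """log p(x | probs) for binary x with independent dimensions."""
    total = 0.0
    for x_i, p_i in zip(x, probs):
        p_i = min(max(p_i, eps), 1.0 - eps)  # clip to avoid log(0)
        total += x_i * math.log(p_i) + (1 - x_i) * math.log(1.0 - p_i)
    return total


def gaussian_log_likelihood(x, means, variances):
    """log p(x | means, variances) for a diagonal Gaussian."""
    total = 0.0
    for x_i, m_i, v_i in zip(x, means, variances):
        total += -0.5 * (math.log(2.0 * math.pi * v_i) + (x_i - m_i) ** 2 / v_i)
    return total


binary_ll = bernoulli_log_likelihood([1, 0, 1], [0.9, 0.2, 0.7])
real_ll = gaussian_log_likelihood([0.5, -1.0], [0.4, -0.8], [1.0, 0.5])
```

In a model like this one, the probabilities (or the means and variances) would be the outputs of the emission network evaluated at the latent state z_t.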

References:

Please cite the following paper if you find the code useful in your research:

@inproceedings{krishnan2016structured,
  title={Structured Inference Networks for Nonlinear State Space Models},
  author={Krishnan, Rahul G and Shalit, Uri and Sontag, David},
  booktitle={AAAI},
  year={2017}
}

This paper subsumes the work in Deep Kalman Filters (https://arxiv.org/abs/1511.05121).
