stanfordmlgroup/ngboost

Stars: 1,643 (Rank 28,446, Top 0.6%)
Language: Python
License: Apache License 2.0
Created over 6 years ago, updated 9 months ago


NGBoost: Natural Gradient Boosting for Probabilistic Prediction

ngboost is a Python library that implements Natural Gradient Boosting, as described in "NGBoost: Natural Gradient Boosting for Probabilistic Prediction". It is built on top of scikit-learn and is designed to be scalable and modular with respect to the choice of proper scoring rule, distribution, and base learner. A didactic introduction to the methodology underlying NGBoost is available in this slide deck.
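As a rough illustration of what "natural gradient" means here, the following is a minimal sketch for a Normal distribution scored with the negative log likelihood. It is not the library's implementation: the function names are hypothetical, and it assumes the distribution is parametrized as (mu, log sigma). The natural gradient is the ordinary gradient of the score premultiplied by the inverse Fisher information.

```python
import numpy as np


def normal_log_score_grad(mu, log_sigma, y):
    """Ordinary gradient of the negative log likelihood w.r.t. (mu, log sigma)."""
    var = np.exp(2.0 * log_sigma)
    d_mu = (mu - y) / var
    d_log_sigma = 1.0 - (y - mu) ** 2 / var
    return np.stack([d_mu, d_log_sigma], axis=-1)


def normal_fisher_information(log_sigma):
    """Fisher information of a Normal in the (mu, log sigma) parametrization."""
    var = np.exp(2.0 * log_sigma)
    fisher = np.zeros(log_sigma.shape + (2, 2))
    fisher[..., 0, 0] = 1.0 / var
    fisher[..., 1, 1] = 2.0
    return fisher


def natural_gradient(mu, log_sigma, y):
    """Solve F @ g_nat = g for each observation, i.e. g_nat = F^{-1} g."""
    grad = normal_log_score_grad(mu, log_sigma, y)
    fisher = normal_fisher_information(log_sigma)
    return np.linalg.solve(fisher, grad[..., None])[..., 0]


# Two observations with the same residual but very different predicted scales.
mu = np.array([0.0, 0.0])
log_sigma = np.log(np.array([1.0, 10.0]))
y = np.array([1.0, 1.0])

print("ordinary gradient:\n", normal_log_score_grad(mu, log_sigma, y))
print("natural gradient:\n", natural_gradient(mu, log_sigma, y))
```

In this sketch the ordinary gradient with respect to mu shrinks as the predicted variance grows, while the natural gradient with respect to mu is simply mu - y whatever the scale.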

Installation

via pip

pip install --upgrade ngboost

via conda-forge

conda install -c conda-forge ngboost

Usage

Probabilistic regression example on the Boston housing dataset:

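import numpy as np
import pandas as pd

from ngboost import NGBRegressor

from sklearn.model_selection import train_test_split
from sklearn.metrics import mean_squared_error

# load_boston was removed in scikit-learn 1.2, so read the raw data directly
# (requires network access)
data_url = "http://lib.stat.cmu.edu/datasets/boston"
raw_df = pd.read_csv(data_url, sep=r"\s+", skiprows=22, header=None)
# each record spans two rows in the raw file: stitch the halves back together
X = np.hstack([raw_df.values[::2, :], raw_df.values[1::2, :2]])
Y = raw_df.values[1::2, 2]

X_train, X_test, Y_train, Y_test = train_test_split(X, Y, test_size=0.2)

ngb = NGBRegressor().fit(X_train, Y_train)
Y_preds = ngb.predict(X_test)    # point predictions
Y_dists = ngb.pred_dist(X_test)  # full predictive distributions

# test Mean Squared Error
test_MSE = mean_squared_error(Y_test, Y_preds)
print('Test MSE', test_MSE)

# test Negative Log Likelihood
test_NLL = -Y_dists.logpdf(Y_test).mean()
print('Test NLL', test_NLL)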

Details on available distributions, scoring rules, learners, tuning, and model interpretation are available in our user guide, which also includes numerous usage examples and information on how to add new distributions or scores to NGBoost.

License

Apache License 2.0.

Reference

Tony Duan, Anand Avati, Daisy Yi Ding, Khanh K. Thai, Sanjay Basu, Andrew Y. Ng, Alejandro Schuler. 2019. NGBoost: Natural Gradient Boosting for Probabilistic Prediction. arXiv
