Language: Python
License: MIT License

chexpert-labeler

CheXpert NLP tool to extract observations from radiology reports.

Read more about our project here and our AAAI 2019 paper here.

Prerequisites

Please install the following dependencies, or use the Dockerized labeler (see below).

  1. Clone the NegBio repository:
git clone https://github.com/ncbi-nlp/NegBio.git
  2. Add the NegBio directory to your PYTHONPATH:
export PYTHONPATH={path to negbio directory}:$PYTHONPATH
  3. Make the virtual environment:
conda env create -f environment.yml
  4. Activate the virtual environment:
conda activate chexpert-label
  5. Install NLTK data:
python -m nltk.downloader universal_tagset punkt wordnet
  6. Download the GENIA+PubMed parsing model:
>>> from bllipparser import RerankingParser
>>> RerankingParser.fetch_and_load('GENIA+PubMed')
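
Before running the labeler, it may help to confirm that the Python dependencies from the steps above are importable in the active environment. The following is a minimal sketch, not part of the repository: the helper `missing_modules` is hypothetical, and the module name `negbio` is an assumption about the package name inside the NegBio checkout.

```python
import importlib.util

# Top-level module names assumed from the steps above; "negbio" in particular
# is an assumption about the package name inside the NegBio checkout.
REQUIRED_MODULES = ["negbio", "nltk", "bllipparser"]


def missing_modules(names):
    """Return the names that cannot be found on the current import path."""
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    missing = missing_modules(REQUIRED_MODULES)
    if missing:
        print("Not importable:", ", ".join(missing))
    else:
        print("All required modules were found.")
```

If `negbio` is reported as not importable, the PYTHONPATH export from step 2 is the first thing to re-check.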

Usage

Place reports in a headerless, single-column CSV file at {reports_path}. Each report must be enclosed in quotes if (1) it contains a comma or (2) it spans multiple lines. See sample_reports.csv (with output labeled_reports.csv) for an example.

python label.py --reports_path {reports_path}

Run python label.py --help for descriptions of all of the command-line arguments.
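
The input format described above (no header, one report per row, quotes around reports that contain commas or line breaks) can be produced with Python's standard csv module. This is a sketch under stated assumptions: the helper `write_reports` and the example reports are hypothetical and are not taken from sample_reports.csv.

```python
import csv
import io


def write_reports(reports, handle):
    """Write one report per row with no header row.

    csv.QUOTE_MINIMAL (the default) quotes a field only when it contains
    the delimiter, the quote character, or a line break.
    """
    writer = csv.writer(handle, quoting=csv.QUOTE_MINIMAL, lineterminator="\n")
    for report in reports:
        writer.writerow([report])


# Hypothetical example reports, not taken from sample_reports.csv.
reports = [
    "No acute cardiopulmonary process.",
    "Heart size is normal, lungs are clear.",
    "Line one of the report.\nLine two of the report.",
]

buffer = io.StringIO()
write_reports(reports, buffer)
csv_text = buffer.getvalue()
print(csv_text)
```

To write to disk instead of an in-memory buffer, pass a file opened with `open(path, "w", newline="")`, then point `--reports_path` at that file.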

Dockerized Labeler

docker build -t chexpert-labeler:latest .
docker run -v $(pwd):/data chexpert-labeler:latest \
  python label.py --reports_path /data/sample_reports.csv --output_path /data/labeled_reports.csv --verbose

Contributions

This repository builds upon the work of NegBio.

This tool was developed by Jeremy Irvin, Pranav Rajpurkar, Michael Ko, Yifan Yu, and Silviana Ciurea-Ilcus.

Citing

If you're using the CheXpert labeling tool, please cite this paper:

@inproceedings{irvin2019chexpert,
  title={CheXpert: A large chest radiograph dataset with uncertainty labels and expert comparison},
  author={Irvin, Jeremy and Rajpurkar, Pranav and Ko, Michael and Yu, Yifan and Ciurea-Ilcus, Silviana and Chute, Chris and Marklund, Henrik and Haghgoo, Behzad and Ball, Robyn and Shpanskaya, Katie and others},
  booktitle={Thirty-Third AAAI Conference on Artificial Intelligence},
  year={2019}
}
