• Language: Python
• License: Apache License 2.0

Peptide-MHC I binding affinity prediction


mhcflurry

MHC I ligand prediction package with competitive accuracy and a fast and documented implementation.

MHCflurry implements class I peptide/MHC binding affinity prediction. The current version provides pan-MHC I predictors supporting any MHC allele of known sequence. MHCflurry runs on Python 3.4+ using the tensorflow neural network library. It exposes command-line and Python library interfaces.

MHCflurry also includes two experimental predictors: an "antigen processing" predictor that attempts to model MHC allele-independent effects such as proteasomal cleavage, and a "presentation" predictor that integrates processing predictions with binding affinity predictions to give a composite "presentation score." Both models are trained on mass spec-identified MHC ligands.

If you find MHCflurry useful in your research please cite:

T. O'Donnell, A. Rubinsteyn, U. Laserson. "MHCflurry 2.0: Improved pan-allele prediction of MHC I-presented peptides by incorporating antigen processing," Cell Systems, 2020. https://doi.org/10.1016/j.cels.2020.06.010

T. O’Donnell, A. Rubinsteyn, M. Bonsack, A. B. Riemer, U. Laserson, and J. Hammerbacher, "MHCflurry: Open-Source Class I MHC Binding Affinity Prediction," Cell Systems, 2018. https://doi.org/10.1016/j.cels.2018.05.014

Please file an issue if you have questions or encounter problems.

Have a bugfix or other contribution? We would love your help. See our contributing guidelines.

Try it now

You can generate MHCflurry predictions without any setup by running our Google Colaboratory notebook.

Installation (pip)

Install the package:

$ pip install mhcflurry

If you don't already have it, you will also need to install tensorflow version 2.2.0 or later. On most platforms you can do this with:

$ pip install tensorflow

If you are on Apple silicon (M1 processor), then you'll need to run pip install tensorflow-macos instead. See these instructions for more info.
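
To confirm that the installed tensorflow satisfies the 2.2.0 minimum, a small check along these lines may help. This is a minimal sketch, not part of MHCflurry: the helper names (`parse_version`, `meets_minimum`, `installed_tensorflow_version`) are hypothetical, the distribution names are taken from the pip commands above, and `importlib.metadata` assumes Python 3.8 or later.

```python
from importlib import metadata

MINIMUM_TENSORFLOW = (2, 2, 0)  # minimum version stated in the install notes


def parse_version(text):
    """Turn '2.11.0' or '2.11.0rc1' into a tuple of ints such as (2, 11, 0)."""
    parts = []
    for piece in text.split(".")[:3]:
        digits = ""
        for ch in piece:
            if not ch.isdigit():
                break  # stop at suffixes such as 'rc1'
            digits += ch
        parts.append(int(digits or 0))
    while len(parts) < 3:
        parts.append(0)
    return tuple(parts)


def meets_minimum(version_text, minimum=MINIMUM_TENSORFLOW):
    """Compare numerically, so that 2.11 correctly counts as newer than 2.2."""
    return parse_version(version_text) >= minimum


def installed_tensorflow_version():
    """Return the installed version string, or None if nothing is found."""
    for name in ("tensorflow", "tensorflow-macos"):
        try:
            return metadata.version(name)
        except metadata.PackageNotFoundError:
            continue
    return None


found = installed_tensorflow_version()
if not found:
    print("tensorflow is not installed")
elif meets_minimum(found):
    print(f"tensorflow {found} is new enough")
else:
    print(f"tensorflow {found} is older than 2.2.0")
```

The comparison uses integer tuples rather than strings because a plain string comparison would rank "2.11.0" below "2.2.0".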

Next, download our datasets and trained models:

$ mhcflurry-downloads fetch

You can now generate predictions:

$ mhcflurry-predict \
       --alleles HLA-A0201 HLA-A0301 \
       --peptides SIINFEKL SIINFEKD SIINFEKQ \
       --out /tmp/predictions.csv
       
Wrote: /tmp/predictions.csv
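
The same peptides can also be scored through the Python library interface. The sketch below assumes that `Class1PresentationPredictor`, with its `load` and `predict` methods, behaves as described in the MHCflurry documentation; the wrapper name `predict_presentation` is made up for illustration. One caveat: when `alleles` is passed as a list, the library is documented as treating it as a single sample's genotype, so the result may not match the command-line output row for row.

```python
def predict_presentation(peptides, alleles):
    """Return MHCflurry presentation predictions (expected: a pandas DataFrame)."""
    # Imported lazily so this file can be loaded without mhcflurry installed.
    from mhcflurry import Class1PresentationPredictor

    # load() reads the models installed by `mhcflurry-downloads fetch`.
    predictor = Class1PresentationPredictor.load()
    return predictor.predict(peptides=peptides, alleles=alleles, verbose=0)


# Inputs copied from the mhcflurry-predict command above.
PEPTIDES = ["SIINFEKL", "SIINFEKD", "SIINFEKQ"]
ALLELES = ["HLA-A0201", "HLA-A0301"]
```

With the models downloaded, `print(predict_presentation(PEPTIDES, ALLELES))` should display the predictions.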

Or scan protein sequences for potential epitopes:

$ mhcflurry-predict-scan \
        --sequences MFVFLVLLPLVSSQCVNLTTRTQLPPAYTNSFTRGVYYPDKVFRSSVLHS \
        --alleles HLA-A*02:01 \
        --out /tmp/predictions.csv
        
Wrote: /tmp/predictions.csv  
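
Conceptually, scanning a protein means scoring every peptide-length window along the sequence. The sketch below illustrates that enumeration only; it is not MHCflurry's implementation, the function name `candidate_peptides` is hypothetical, and the default lengths of 8 to 11 are an assumption about typical class I ligand lengths.

```python
def candidate_peptides(sequence, lengths=(8, 9, 10, 11)):
    """Yield (start_offset, peptide) for every window of each requested length."""
    for length in lengths:
        # A sequence shorter than `length` gives an empty range, so no windows.
        for start in range(len(sequence) - length + 1):
            yield start, sequence[start:start + length]


# Sequence copied from the mhcflurry-predict-scan command above.
SEQUENCE = "MFVFLVLLPLVSSQCVNLTTRTQLPPAYTNSFTRGVYYPDKVFRSSVLHS"

nine_mers = [peptide for _, peptide in candidate_peptides(SEQUENCE, lengths=(9,))]
print(len(nine_mers), nine_mers[0], nine_mers[-1])
```

A sequence of length n contains n - k + 1 windows of length k, which is why the number of candidates grows roughly linearly with protein length for each peptide length scanned.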

See the documentation for more details.

Docker

You can also try the latest (GitHub master) version of MHCflurry using the Docker image hosted on Dockerhub by running:

$ docker run -p 9999:9999 --rm openvax/mhcflurry:latest

This will start a jupyter notebook server in an environment that has MHCflurry installed. Go to http://localhost:9999 in a browser to use it.

To build the Docker image yourself, from a checkout run:

$ docker build -t mhcflurry:latest .
$ docker run -p 9999:9999 --rm mhcflurry:latest

Predicted sequence motifs

Sequence logos for the binding motifs learned by MHCflurry BA are available here.

Common issues and fixes

Problems downloading data and models

Some users have reported HTTP connection issues when using mhcflurry-downloads fetch. As a workaround, you can download the data manually (e.g. using wget) and then use mhcflurry-downloads just to copy the data to the right place.

To do this, first get the URL(s) of the downloads you need using mhcflurry-downloads url:

$ mhcflurry-downloads url models_class1_presentation
https://github.com/openvax/mhcflurry/releases/download/1.6.0/models_class1_presentation.20200205.tar.bz2

Then make a directory and download the needed files to this directory:

$ mkdir downloads
$ wget --directory-prefix downloads https://github.com/openvax/mhcflurry/releases/download/1.6.0/models_class1_presentation.20200205.tar.bz2

HTTP request sent, awaiting response... 200 OK
Length: 72616448 (69M) [application/octet-stream]
Saving to: 'downloads/models_class1_presentation.20200205.tar.bz2'

Now call mhcflurry-downloads fetch with the --already-downloaded-dir option to indicate that the downloads should be retrieved from the specified directory:

$ mhcflurry-downloads fetch models_class1_presentation --already-downloaded-dir downloads
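
If wget is not available, the manual download step can be approximated with Python's standard library. This is a rough sketch under stated assumptions: the helper names `local_path_for` and `download` are hypothetical, the URL is the one printed by mhcflurry-downloads url above, and a network problem that affects mhcflurry-downloads fetch may well affect this script too.

```python
import os
import urllib.request
from urllib.parse import urlparse


def local_path_for(url, directory="downloads"):
    """Derive the destination path from the last component of the URL path."""
    filename = os.path.basename(urlparse(url).path)
    return os.path.join(directory, filename)


def download(url, directory="downloads"):
    """Fetch url into directory unless the file is already there."""
    os.makedirs(directory, exist_ok=True)
    destination = local_path_for(url, directory)
    if not os.path.exists(destination):
        # Write to a temporary name first so that an interrupted transfer
        # is never mistaken for a complete file on the next run.
        partial = destination + ".part"
        urllib.request.urlretrieve(url, partial)
        os.replace(partial, destination)
    return destination


URL = (
    "https://github.com/openvax/mhcflurry/releases/download/1.6.0/"
    "models_class1_presentation.20200205.tar.bz2"
)
print(local_path_for(URL))
```

Calling `download(URL)` performs the transfer; the resulting directory can then be passed to --already-downloaded-dir as shown above.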
