DiffAb
Antigen-Specific Antibody Design and Optimization with Diffusion-Based Generative Models for Protein Structures (NeurIPS 2022)
Install
Environment
conda env create -f env.yaml -n diffab
conda activate diffab
The default cudatoolkit
version is 11.3. You may change it in env.yaml
.
Datasets and Trained Weights
Protein structures in the SAbDab
dataset can be downloaded here. Extract all_structures.zip
into the data
folder.
The data
folder contains a snapshot of the dataset index (sabdab_summary_all.tsv
). You may replace the index with the latest version here.
Trained model weights are available here (Hugging Face) or here (Google Drive).
[Optional] HDOCK
HDOCK is required to design CDRs for antigens without bound antibody frameworks. Please download HDOCK here and put the hdock
and createpl
programs into the bin
folder.
[Optional] PyRosetta
PyRosetta is required to relax the generated structures and compute binding energy. Please follow the instruction here to install.
[Optional] Ray
Ray is required to relax and evaluate the generated antibodies. Please install Ray using the following command:
pip install -U ray
Design Antibodies
5 design modes are available. Each mode corresponds to a config file in the configs/test
folder:
Config File | Description |
---|---|
codesign_single.yml |
Sample both the sequence and structure of one CDR. |
codesign_multicdrs.yml |
Sample both the sequence and structure of all the CDRs simultaneously. |
abopt_singlecdr.yml |
Optimize the sequence and structure of one CDR. |
fixbb.yml |
Sample only the sequence of one CDR (fix-backbone sequence design). |
strpred.yml |
Sample only the structure of one CDR (structure prediction). |
Antibody-Antigen Complex
Below is the usage of design_pdb.py
. It samples CDRs for antibody-antigen complexes. The full list of options can be found in diffab/tools/runner/design_for_pdb.py
.
python design_pdb.py \
<path-to-pdb> \
--heavy <heavy-chain-id> \
--light <light-chain-id> \
--config <path-to-config-file>
The --heavy
and --light
options can be omitted as the script can automatically identify them with AbNumber and ANARCI.
The below example designs the six CDRs separately for the 7DK2_AB_C
antibody-antigen complex.
python design_pdb.py ./data/examples/7DK2_AB_C.pdb \
--config ./config/test/codesign_single.yml
Antigen Only
HDOCK is required to design antibodies for antigens without bound antibody structures (see above for instructions on installing HDOCK). Below is the usage of design_dock.py
.
python design_dock.py \
--antigen <path-to-antigen-pdb> \
--antibody <path-to-antibody-template-pdb> \
--config <path-to-config-file>
The --antibody
option is optional and the default antibody template is 3QHF_Fv.pdb
. The full list of options can be found in the script.
Below is an example that designs antibodies for SARS-CoV-2 Omicron RBD.
python design_dock.py \
--antigen ./data/examples/Omicron_RBD.pdb \
--config ./config/test/codesign_multicdrs.yml
Train
python train.py ./configs/train/<config-file-name>
Reference
@inproceedings{luo2022antigenspecific,
title={Antigen-Specific Antibody Design and Optimization with Diffusion-Based Generative Models for Protein Structures},
author={Shitong Luo and Yufeng Su and Xingang Peng and Sheng Wang and Jian Peng and Jianzhu Ma},
booktitle={Advances in Neural Information Processing Systems},
editor={Alice H. Oh and Alekh Agarwal and Danielle Belgrave and Kyunghyun Cho},
year={2022},
url={https://openreview.net/forum?id=jSorGn2Tjg}
}