  • Language: Python
  • License: Apache License 2.0

Mass Spectrometry for Small Molecules using Deep Learning

Deep learning for Electron Ionization mass spectrometry for organic molecules

This repository accompanies the paper:

Rapid Prediction of Electron–Ionization Mass Spectrometry Using Neural Networks
Jennifer N. Wei, David Belanger, Ryan P. Adams, and D. Sculley
ACS Central Science 2019 5 (4), 700-708
DOI: 10.1021/acscentsci.9b00085

Introduction

We predict the mass spectra of molecules using deep learning techniques applied to various molecule representations. Performance is evaluated with a custom library matching task, in which we identify a molecule by matching its spectrum against a library of labeled spectra. As a baseline, the library contains all of the molecules in the NIST main library, which mimics the workflow currently used by experimental chemists. To test our predictions, we replace portions of the library with spectra predicted by our model. This task is described in more detail below.
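
The matching step can be sketched in a few lines of Python. This is a minimal illustration, not the repository's implementation: it assumes spectra are stored as {m/z: intensity} dictionaries and uses plain cosine similarity as the match score, which may differ from the similarity measure used in the paper. The names cosine_similarity and rank_library, and the toy library entries, are hypothetical.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two spectra given as {m/z: intensity} dicts."""
    dot = sum(intensity * b.get(mz, 0.0) for mz, intensity in a.items())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # an empty spectrum matches nothing
    return dot / (norm_a * norm_b)


def rank_library(query, library):
    """Return library entry names sorted from best to worst match."""
    scored = [(cosine_similarity(query, spectrum), name)
              for name, spectrum in library.items()]
    return [name for _, name in sorted(scored, reverse=True)]


# Toy data (hypothetical). In the task described above, some library
# entries would be measured spectra and others model predictions.
library = {
    "molecule_A": {41: 30.0, 43: 100.0, 58: 20.0},
    "molecule_B": {51: 40.0, 77: 100.0, 105: 60.0},
}
query = {51: 35.0, 77: 95.0, 105: 70.0}

ranking = rank_library(query, library)
print(ranking[0])  # molecule_B
```

The rank of the correct molecule in such a list is one way to score a library: the higher the true molecule ranks when its library entry is a predicted spectrum, the more useful the prediction.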

Required Packages:

It is recommended to use Anaconda with a Python 3.6 environment to install these packages.

Most of the required packages can be installed with:
$ conda install tensorflow=1.13.2 rdkit matplotlib
$ pip install absl-py
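
To confirm that the environment is set up, a short check like the following may help. It is only a sketch: it tests whether the top-level modules can be located (absl is the import name of the absl-py package) and does not verify versions. The helper name missing_modules is hypothetical.

```python
import importlib.util

# Import names of the packages listed above.
REQUIRED_MODULES = ["tensorflow", "rdkit", "matplotlib", "absl"]


def missing_modules(names):
    """Return the module names that cannot be found in this environment."""
    return [name for name in names if importlib.util.find_spec(name) is None]


missing = missing_modules(REQUIRED_MODULES)
if missing:
    print("Missing packages:", ", ".join(missing))
else:
    print("All required packages are importable.")
```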

Quickstart Guide for Making Model Predictions

  1. Create a directory and download the weights for the model.
$ MODEL_WEIGHTS_DIR=/home/path/to/model
$ mkdir $MODEL_WEIGHTS_DIR
$ pushd $MODEL_WEIGHTS_DIR
$ curl -O https://storage.googleapis.com/deep-molecular-massspec/massspec_weights/massspec_weights.zip
$ unzip massspec_weights.zip
$ popd
  2. Run the model prediction on the example molecule.
$ python make_spectra_prediction.py \
--input_file=examples/pentachlorobenzene.sdf \
--output_file=/tmp/annotated.sdf \
--weights_dir=$MODEL_WEIGHTS_DIR/massspec_weights
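
The output is an SDF file, so its annotations can be inspected with a small parser. The sketch below relies only on the standard SDF layout (data fields introduced by a "> <NAME>" header, ended by a blank line, with "$$$$" closing each record). The name of the field in which make_spectra_prediction.py stores the prediction is not stated here, so the code simply collects whatever fields are present. The function name read_sdf_properties and the field EXAMPLE_FIELD in the sample are hypothetical.

```python
import re


def read_sdf_properties(text):
    """Return one {field_name: value} dict per molecule record in SDF text.

    Simplification: any line starting with '>' is treated as a field header.
    """
    records = []
    props = {}
    field = None        # name of the data field currently being read
    value_lines = []
    for line in text.splitlines():
        if line.strip() == "$$$$":            # end of a molecule record
            if field is not None:
                props[field] = "\n".join(value_lines)
            records.append(props)
            props, field, value_lines = {}, None, []
        elif line.startswith(">"):            # data field header
            if field is not None:
                props[field] = "\n".join(value_lines)
            match = re.search(r"<([^>]+)>", line)
            field = match.group(1) if match else None
            value_lines = []
        elif field is not None:
            if line.strip() == "":            # blank line ends the field
                props[field] = "\n".join(value_lines)
                field = None
            else:
                value_lines.append(line)
    return records


# Tiny inline sample with a made-up field, used in place of a real file.
sample = """example
  hypothetical

  0  0  0  0  0  0  0  0  0  0999 V2000
M  END
>  <EXAMPLE_FIELD>
41 30
43 100

$$$$
"""

records = read_sdf_properties(sample)
print(sorted(records[0]))  # field names found in the first record
```

To inspect the real output, read /tmp/annotated.sdf into a string and pass it to the same function.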

Training splits for benchmarking purposes

The molecules used for the training, validation, and test sets are listed in the training_splits directory. They are provided in both InChIKey and SMILES formats.

To cite this work:

@article{doi:10.1021/acscentsci.9b00085,
author = {Wei, Jennifer N. and Belanger, David and Adams, Ryan P. and Sculley, D.},
title = {Rapid Prediction of Electron–Ionization Mass Spectrometry Using Neural Networks},
journal = {ACS Central Science},
volume = {5},
number = {4},
pages = {700-708},
year = {2019},
doi = {10.1021/acscentsci.9b00085},
URL = {https://doi.org/10.1021/acscentsci.9b00085}
}
