  • Stars: 268
  • Rank: 152,314 (Top 4%)
  • Language: Python
  • License: MIT License
  • Created almost 8 years ago
  • Updated about 1 year ago

Repository Details

Automatically extract chemical information from scientific documents

ChemDataExtractor

ChemDataExtractor is a toolkit for extracting chemical information from the scientific literature.

Features

  • HTML, XML and PDF document readers
  • Chemistry-aware natural language processing pipeline
  • Chemical named entity recognition
  • Rule-based parsing grammars for property and spectra extraction
  • Table parser for extracting tabulated data
  • Document processing to resolve data interdependencies
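To give a feel for what rule-based property extraction means, here is a minimal, standard-library-only sketch that pulls a melting point out of a sentence with a regular expression. It is an illustration only, not ChemDataExtractor's actual grammar or API: the names `MP_PATTERN` and `extract_melting_point` are hypothetical, and a real parsing grammar would also have to handle far more phrasing variants.

```python
import re
from typing import Dict, Optional

# Hypothetical pattern (an assumption for illustration, not the library's grammar):
# a trigger phrase ("melting point" or "m.p."), an optional connector,
# a number or numeric range, then a degree sign and a unit letter.
MP_PATTERN = re.compile(
    r"\b(?:melting point|m\.?p\.?)\s*(?:of|:|=|is|was)?\s*"
    r"(?P<value>\d+(?:\.\d+)?(?:\s*[-–]\s*\d+(?:\.\d+)?)?)\s*"
    r"°\s*(?P<units>[CF])",
    re.IGNORECASE,
)


def extract_melting_point(sentence: str) -> Optional[Dict[str, str]]:
    """Return the first melting point found in the sentence, or None."""
    match = MP_PATTERN.search(sentence)
    if match is None:
        return None
    # Remove whitespace inside ranges such as "182 - 184".
    value = re.sub(r"\s+", "", match.group("value"))
    return {"value": value, "units": "°" + match.group("units").upper()}


if __name__ == "__main__":
    print(extract_melting_point("The product was a white solid, m.p. 182-184 °C."))
    print(extract_melting_point("No thermal data were reported."))
```

A single hand-written pattern like this breaks quickly on real literature text, which is why a toolkit combines such rules with tokenization, entity recognition, and document-level processing.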

Installation

To install ChemDataExtractor, simply run:

pip install chemdataextractor

Or if you are an Anaconda user, run:

conda install -c chemdataextractor chemdataextractor

Alternatively, try one of the other installation options.
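
After installing, one quick way to confirm that the package is visible to your Python interpreter is to look it up without importing it. The sketch below uses only the standard library; the helper name `is_installed` is hypothetical, and it assumes the import name matches the package name `chemdataextractor`.

```python
import importlib.util


def is_installed(package: str = "chemdataextractor") -> bool:
    """Return True if a top-level package can be found by the current interpreter."""
    # find_spec returns None when no top-level module of that name is found.
    return importlib.util.find_spec(package) is not None


if __name__ == "__main__":
    if is_installed():
        print("chemdataextractor is available")
    else:
        print("chemdataextractor was not found; check the active environment")
```

If the check fails right after a successful install, the usual cause is that pip or conda installed into a different environment from the one running the script.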

Documentation

Full documentation is available at http://chemdataextractor.org/docs

License

ChemDataExtractor is licensed under the MIT license, a permissive, business-friendly license for open source software.