  • Stars: 362
  • Rank: 117,671 (Top 3 %)
  • Language: Python
  • License: Other
  • Created over 10 years ago
  • Updated 9 months ago

Repository Details

Official Python client for accessing the ChEMBL API

ChEMBL webresource client

This is the only official Python client library developed and supported by the ChEMBL group.

The library helps you access ChEMBL data and cheminformatics tools from Python. You don't need to know how to write SQL, how to interact with REST APIs, or how to compile or install any cheminformatics frameworks. Results are cached.

The client handles interaction with the HTTPS protocol and caches all results in the local file system for faster retrieval. By abstracting away all network-related tasks, the client provides the end user with a convenient interface that gives the impression of working with a local resource. The design is based on the Django QuerySet interface. The client also implements lazy evaluation of results, which means it only sends a request for data when a value is actually required. This approach reduces the number of network requests and increases performance.
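The lazy evaluation described above can be illustrated with a minimal, self-contained sketch. This is not the client's actual implementation: the LazyQuery class, the fake_request function and the sample record are hypothetical stand-ins chosen only to show the idea that building a query sends nothing, and that results are fetched once and then reused.

```python
class LazyQuery:
    """Defers a fetch until results are needed, then reuses them (illustrative only)."""

    def __init__(self, fetch):
        self._fetch = fetch      # callable standing in for a network request
        self._results = None     # nothing has been fetched yet

    def _evaluate(self):
        # Run the fetch only on first use; later calls reuse the stored list.
        if self._results is None:
            self._results = list(self._fetch())
        return self._results

    def __iter__(self):
        return iter(self._evaluate())

    def __len__(self):
        return len(self._evaluate())


calls = []


def fake_request():
    # Hypothetical stand-in for an HTTPS call; the record is placeholder data.
    calls.append("request")
    return [{"id": "example-1"}]


query = LazyQuery(fake_request)   # building the query sends nothing
requests_before = len(calls)
first = list(query)               # iterating triggers the single request
second = list(query)              # served from the stored results
requests_after = len(calls)
```

In this sketch the request counter stays at zero until the query is iterated, and a second iteration does not issue another request.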

Installation

pip install chembl_webresource_client

Live Jupyter notebook with examples

Click here

Available filters

The design of the client is based on the Django QuerySet interface (https://docs.djangoproject.com/en/1.11/ref/models/querysets), and the most important lookup types are supported. These are:

  • exact
  • iexact
  • contains
  • icontains
  • in
  • gt
  • gte
  • lt
  • lte
  • startswith
  • istartswith
  • endswith
  • iendswith
  • range
  • isnull
  • regex
  • iregex
  • search
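In the Django convention that the client follows, a lookup type is appended to a field name with a double underscore. The sketch below shows how such a keyword decomposes. The split_lookup helper is hypothetical and written only for illustration; it is not the client's own parser, and treating a missing suffix as exact is borrowed from Django's behaviour. The find_light_molecules function assumes the new_client.molecule resource from the project's published examples, which is not shown in this section, and it needs the package and network access, so it is defined but not called here.

```python
# Lookup types listed above.
LOOKUPS = {
    "exact", "iexact", "contains", "icontains", "in", "gt", "gte", "lt",
    "lte", "startswith", "istartswith", "endswith", "iendswith", "range",
    "isnull", "regex", "iregex", "search",
}


def split_lookup(keyword):
    """Split 'field__lookup' into (field, lookup); no suffix means 'exact'."""
    field, sep, suffix = keyword.rpartition("__")
    if sep and suffix in LOOKUPS:
        return field, suffix
    return keyword, "exact"


def find_light_molecules(max_weight=300):
    # Assumed usage: requires chembl_webresource_client and network access,
    # so the import is kept inside the function.
    from chembl_webresource_client.new_client import new_client
    return new_client.molecule.filter(
        molecule_properties__mw_freebase__lte=max_weight
    )


simple = split_lookup("pref_name__iexact")
nested = split_lookup("molecule_properties__mw_freebase__lte")
bare = split_lookup("molecule_chembl_id")
```

Note that only the final segment is read as a lookup type; earlier double underscores address nested fields.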

Only operator

only is a special method that limits the results to a selected set of fields. It takes a single argument: a list of fields that should be included in the result. The specified fields have to exist in the endpoint against which only is executed. Using only will usually make an API call faster, because returning less information saves bandwidth. The API logic also checks whether any SQL joins are necessary to return the specified fields and excludes unnecessary joins, which critically improves performance.

Please note that only has one limitation: nested fields in the list are ignored, i.e. calling only(['molecule_properties__alogp']) is equivalent to calling only(['molecule_properties']).

For many-to-many relationships, only will not perform any SQL join optimisation.
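The nested-field limitation can be made concrete with a small sketch. The effective_only_fields helper is hypothetical: it only models the documented behaviour (a nested name collapses to its top-level field) and is not part of the client. The names_and_ids function assumes the new_client.molecule resource from the project's published examples and needs the package and network access, so it is defined but not called here.

```python
def effective_only_fields(fields):
    """Collapse nested names to their top-level field, preserving order."""
    top_level = []
    for name in fields:
        root = name.split("__", 1)[0]   # drop everything after the first '__'
        if root not in top_level:
            top_level.append(root)
    return top_level


def names_and_ids(name="aspirin"):
    # Assumed usage: requires chembl_webresource_client and network access,
    # so the import is kept inside the function.
    from chembl_webresource_client.new_client import new_client
    return new_client.molecule.filter(pref_name__iexact=name).only(
        ["molecule_chembl_id", "pref_name"]
    )


collapsed = effective_only_fields(["molecule_properties__alogp"])
mixed = effective_only_fields(
    ["molecule_chembl_id", "molecule_properties__alogp", "molecule_properties__full_mwt"]
)
```

Under this model, asking for a single nested property still returns the whole molecule_properties object, so any further narrowing has to happen on the client side.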

Settings

In order to use settings you need to import them before using the client:

from chembl_webresource_client.settings import Settings

The Settings object is a singleton that exposes an Instance method, for example:

Settings.Instance().TIMEOUT = 10

Most important options:

CACHING: whether results should be cached locally (default is True)
CACHE_EXPIRE: cache expiry time in seconds (default is 24 hours)
CACHE_NAME: name of the .sqlite file that holds the cache
TOTAL_RETRIES: total number of retries per HTTP request (default is 3)
CONCURRENT_SIZE: total number of concurrent requests (default is 50)
FAST_SAVE: speeds up cache saving by up to 50 times, at the risk of data loss (default is True)
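Several options can be set in one place before the client is used. The sketch below is one possible way to do that, not a prescribed one: the OVERRIDES values and the apply_overrides and configure_client helpers are hypothetical, and only the option names and Settings.Instance() come from the text above. The helper is exercised against a plain stand-in object so that the example runs without the package installed.

```python
from types import SimpleNamespace

# Illustrative values only; option names are taken from the list above.
OVERRIDES = {
    "CACHING": True,
    "CACHE_EXPIRE": 60 * 60 * 12,   # 12 hours, expressed in seconds
    "TOTAL_RETRIES": 5,
    "TIMEOUT": 10,
}


def apply_overrides(settings, overrides):
    """Set each option as an attribute on the given settings object."""
    for name, value in overrides.items():
        setattr(settings, name, value)
    return settings


def configure_client():
    # Requires chembl_webresource_client; call this before using the client.
    from chembl_webresource_client.settings import Settings
    return apply_overrides(Settings.Instance(), OVERRIDES)


# Stand-in object used here instead of the real singleton.
demo = apply_overrides(SimpleNamespace(), OVERRIDES)
```

Because Settings is a singleton, values set through Settings.Instance() in one module are seen everywhere else the settings are read.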

Citing

https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4489243/

More Repositories

1. ChEMBL_Structure_Pipeline (Python, 191 stars): ChEMBL database structure pipelines
2. FPSim2 (Python, 111 stars): Simple package for fast molecular similarity searches
3. mychembl (Jupyter Notebook, 57 stars): Resources used to create the myChEMBL virtual machine
4. chembl_beaker (Python, 48 stars): RDKit wrapper
5. GLaDOS (JavaScript, 45 stars): Web Interface for ChEMBL @ EMBL-EBI
6. surechembl-data-client (Python, 34 stars): A collection of scripts for retrieving, storing, and querying SureChEMBL data.
7. tractability_pipeline_v2 (Python, 21 stars): Pipeline for assessing the tractability of potential targets (starting from Gene IDs)
8. chembl_webservices_2 (Python, 16 stars): Source code of the ChEMBL web services.
9. target_predictions (Python, 12 stars)
10. autoencoder_ipython (Jupyter Notebook, 12 stars): IPython notebook for blog post entry
11. chembl_multitask_model (Python, 11 stars): Target prediction multitask neural network, with examples running it in Python, C++, Julia and JS
12. notebooks (Jupyter Notebook, 9 stars): notebook repository
13. ModifiedNB (Python, 7 stars): Popular cheminformatics Naïve Bayes model implemented in scikit-learn
14. cbl_migrator (Python, 7 stars): Migrates Oracle DBs to PostgreSQL, MySQL and SQLite
15. GLaDOS-docs (7 stars): Repository for storing the source files of the new interface documentation. https://chembl.gitbook.io/chembl-interface-documentation/
16. of_conformal (Python, 7 stars): OpenFaaS function re-implementing https://doi.org/10.1186/s13321-018-0325-4 with LightGBM
17. compound_target_pairs_dataset (Python, 7 stars): Automatic extraction of interacting compound-target pairs from ChEMBL.
18. antidote (6 stars): An open platform for chemoinformatics and data-driven drug discovery applications
19. chembl_target_predictions (Python, 6 stars): Set of scripts used by the ChEMBL group to generate target predictions
20. tractability_pipeline (Python, 5 stars): Replaced by: https://github.com/chembl/tractability_pipeline_v2
21. the-S3-amongos (Python, 5 stars): S3 (AWS Simple Storage Service) server clone using MongoDB, PyMongo and Tornado.
22. chembl_webservices_py3 (Python, 4 stars): ChEMBL Web Services in Python 3
23. chembl_core_db (Python, 4 stars)
24. chembl_core_model (Python, 4 stars)
25. ChEMBL_NTD-Markdown (4 stars): Markdown files for the new ChEMBL_NTD page: https://chembl.gitbook.io/chembl-ntd/
26. eodc_code_examples (Python, 3 stars): Source code for code snippets submitted to the Expert Opinion on Drug Discovery
27. ChEMBL-Loader-Documentation (3 stars): A repository for the ChEMBL loader documentation as shown in gitbook. (https://chembl.gitbook.io/chembl-loader/)
28. chembl_invivo_assay (Python, 3 stars): This repository identifies and annotates in vivo assays.
29. chembl_assay_matrix (CSS, 2 stars): Python package generating a compound co-occurrence matrix for all assays from a given document
30. sachem_elchem (C, 2 stars): Sachem Elchem plugin for elasticsearch
31. potsim2 (Python, 2 stars): PotSim2: Simple package to segment and compare protein potential grids
32. mmv_train_image (Python, 2 stars)
33. unichem2index (Go, 1 star): Queries Unichem's DB and indexes the structure data into an Elasticsearch index
34. chembl_ws_2_es (Python, 1 star): Tools to migrate from ChEMBL web services to Elastic Search
35. chembl_webservices (Python, 1 star): Python package providing chembl webservices API.
36. Unichem-Documentation (1 star)
37. structure_pipeline_binder (Jupyter Notebook, 1 star)
38. test_data (1 star): Repository to store some data to help in some tests and experiments.
39. surechembl-issues (1 star): Public issue report repository for SureChEMBL
40. idg_patents_paper (Perl, 1 star)
41. surechembl-docker-data-client (Dockerfile, 1 star): Dockerized example for Surechembl Data Client App.
42. chembl_api (Python, 1 star): Python package providing full CRUD operations using REST out of ChEMBL model for internal web apps.
43. openfaas_tp (Python, 1 star)
44. KNIME_REST_example (1 star): Example of accessing ChEMBL API from KNIME using KREST nodes.
45. speices_tagger (C++, 1 star): SPECIES tagger developed by Evangelos Pafilis et al.
46. pfam_maps (JavaScript, 1 star): Django app for a web-interface to manually curate mappings of small molecule binding to Pfam-A domains
47. chemistry_service (Python, 1 star)