  • Stars: 102
  • Rank: 335,584 (Top 7 %)
  • Language: Python
  • License: MIT License
  • Created almost 7 years ago
  • Updated over 4 years ago

Repository Details

Saber is a deep-learning based tool for information extraction in the biomedical domain. Pull requests are welcome! Note: this is a work in progress. Many things are broken, and the codebase is not stable.

Saber

Saber (Sequence Annotator for Biomedical Entities and Relations) is a deep-learning based tool for information extraction in the biomedical domain.

Installation • Quickstart • Documentation

Installation

Note! This is a work in progress. Many things are broken, and the codebase is not stable.

To install Saber, you will need Python 3.6.

Latest PyPI stable release

(saber) $ pip install saber

The install from PyPI is currently broken; please install using the instructions below.

Latest development release on GitHub

Pull and install straight from GitHub

(saber) $ pip install git+https://github.com/BaderLab/saber.git

or install by cloning the repository

(saber) $ git clone https://github.com/BaderLab/saber.git
(saber) $ cd saber

and then using either pip

(saber) $ pip install -e .

or setuptools

(saber) $ python setup.py install

See the documentation for more detailed installation instructions.

Quickstart

If your goal is to use Saber to annotate biomedical text, then you can either use the web-service or a pre-trained model. If you simply want to check Saber out, without installing anything locally, try the Google Colaboratory notebook.

Google Colaboratory

The fastest way to check out Saber is by following along with the Google Colaboratory notebook (Colab). To run the cells, select "Open in Playground" or save a copy to your own Google Drive account (File > Save a copy in Drive).

Web-service

To use Saber as a local web-service, run

(saber) $ python -m saber.cli.app

or, if you prefer, you can pull & run the Saber image from Docker Hub

# Pull Saber image from Docker Hub
$ docker pull pathwaycommons/saber
# Run docker (use `-dt` instead of `-it` to run container in background)
$ docker run -it --rm -p 5000:5000 --name saber pathwaycommons/saber
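When the container is started in the background, the service may take a moment before it accepts requests. The following is a minimal sketch, not part of Saber, of a helper that waits until something is listening on the published port. The function name `wait_for_port` and its defaults are hypothetical; only the host port 5000 comes from the `docker run` command above.

```python
import socket
import time


def wait_for_port(host="localhost", port=5000, timeout=30.0, interval=0.5):
    """Return True once a TCP connection to host:port succeeds, False on timeout."""
    deadline = time.monotonic() + timeout
    while True:
        try:
            # create_connection raises OSError while nothing is listening.
            with socket.create_connection((host, port), timeout=interval):
                return True
        except OSError:
            if time.monotonic() >= deadline:
                return False
            time.sleep(interval)
```

A successful connection only shows that the port is open; it does not confirm that a model has finished loading.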

There are currently two endpoints, /annotate/text and /annotate/pmid. Both expect a POST request with a JSON payload, e.g.,

{
  "text": "The phosphorylation of Hdm2 by MK2 promotes the ubiquitination of p53."
}

or

{
  "pmid": 11835401
}

For example, with the web-service running locally, using cURL

$ curl -X POST 'http://localhost:5000/annotate/text' \
--data '{"text": "The phosphorylation of Hdm2 by MK2 promotes the ubiquitination of p53."}'

Documentation for the Saber web-service API can be found here.
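The same requests can be sent from Python. Below is a minimal sketch using only the standard library, assuming the service is running locally on port 5000 as in the cURL example. The helper names `build_request` and `annotate` are hypothetical, the `Content-Type` header is an assumption (the cURL example does not set one), and the response is assumed to be JSON; check the web-service API documentation for the actual response format.

```python
import json
import urllib.request

BASE_URL = "http://localhost:5000"  # assumption: service running locally


def build_request(endpoint, payload, base_url=BASE_URL):
    """Build a POST request carrying a JSON payload for /annotate/<endpoint>."""
    body = json.dumps(payload).encode("utf-8")
    return urllib.request.Request(
        url=f"{base_url}/annotate/{endpoint}",
        data=body,
        # Assumption: the service accepts an explicit JSON content type.
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def annotate(endpoint, payload, base_url=BASE_URL, timeout=60):
    """Send the request and decode the response body, assumed to be JSON."""
    request = build_request(endpoint, payload, base_url)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))
```

With the service up, `annotate("text", {"text": "..."})` targets /annotate/text and `annotate("pmid", {"pmid": 11835401})` targets /annotate/pmid.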

Pre-trained models

First, import the Saber class. This is the interface to Saber

from saber.saber import Saber

then create a Saber object

saber = Saber()

and then load the model of our choice

saber.load('PRGE')

To annotate text with the model, just call the Saber.annotate() method

saber.annotate("The phosphorylation of Hdm2 by MK2 promotes the ubiquitination of p53.")

See the documentation for more details on using pre-trained models.
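The steps above can be combined to annotate several texts with one loaded model. This is a minimal sketch: `load_annotator` and `annotate_all` are hypothetical helpers, not part of Saber, and they rely only on the `Saber()`, `load()` and `annotate()` calls shown above. The return value of `annotate()` is passed through unchanged, since its format is not described here.

```python
def load_annotator(model_name="PRGE"):
    """Create a Saber object and load a pre-trained model by name."""
    # Imported lazily so the helpers can be defined without Saber installed.
    from saber.saber import Saber

    annotator = Saber()
    annotator.load(model_name)
    return annotator


def annotate_all(annotator, texts):
    """Annotate each non-blank text, returning (text, annotation) pairs."""
    results = []
    for text in texts:
        text = text.strip()
        if not text:
            continue  # skip blank inputs rather than passing them to the model
        results.append((text, annotator.annotate(text)))
    return results
```

Loading the model once and reusing the object avoids repeating the load step for every input.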

Documentation

Documentation for the Saber package can be found here. The web-service API has its own documentation here.

You can also call help() on any Saber method for more information

from saber.saber import Saber

saber = Saber()

help(saber.annotate)

or pass the --help flag to any of the command-line interfaces

(saber) $ python -m saber.cli.train --help

Feel free to open an issue or reach out to us on our Slack channel for more help.
