• Stars: 274
• Rank: 149,395 (Top 3%)
• Language: Python
• License: Apache License 2.0
• Created about 5 years ago
• Updated over 1 year ago

Repository Details

Deep learning with text doesn't have to be scary.

This is a library designed to provide a uniform interface to various deep learning models for text via programmatically created Docker containers.

Usage

See the docs for prerequisites, a quickstart, and the API reference. In brief, you need Python 3.7 and Docker installed, and your user account needs permission to run Docker commands. Then run the following:

pip install gobbli
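
If you'd like to confirm the prerequisites first, a check along these lines may help. This is a minimal sketch, not part of gobbli: `check_prerequisites` is a hypothetical helper, and it assumes Python 3.7 is a minimum rather than an exact requirement. It only looks for a `docker` executable on the PATH; it doesn't verify that your user is allowed to run Docker commands.

```python
import shutil
import sys


def check_prerequisites(version_info=sys.version_info, which=shutil.which):
    """Return a list of problems found; an empty list means nothing is missing.

    Hypothetical helper, not part of gobbli. Both arguments can be
    overridden, which makes the function easy to test.
    """
    problems = []
    # Assumption: 3.7 is treated as the minimum supported version.
    if tuple(version_info[:2]) < (3, 7):
        problems.append(
            "Python 3.7 is required; found %d.%d" % tuple(version_info[:2])
        )
    # shutil.which returns None when the executable isn't on the PATH.
    if which("docker") is None:
        problems.append("docker executable not found on PATH")
    return problems


if __name__ == "__main__":
    for problem in check_prerequisites():
        print("Missing prerequisite:", problem)
```

An empty result only means both tools were found; running `docker info` as your user is still the quickest way to check permissions.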

You may also want to check out the benchmarks to see some comparisons of gobbli's implementation of various models in different situations.

Interactive

gobbli provides streamlit apps to perform some interactive tasks in a web browser, such as data exploration and model evaluation. Once you've installed the library, you can run the bundled apps using the gobbli command line application. Check the docs for more information.

Development

Assuming you have all prerequisites noted above, install the package and all required and optional dependencies in development mode:

pip install -e ".[augment,tokenize,interactive]"

Install additional dev dependencies:

pip install -r requirements.txt

Run linting, autoformatting, and tests:

./run_ci.sh

To avoid manually fixing some linting and formatting errors, consider enabling isort and black support in your favorite editor.

If your environment has less than 12GB of memory, pass the --low-resource argument when running tests to avoid out-of-memory errors.
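
If you script your test runs, the memory rule above can be applied automatically. This is a rough sketch under stated assumptions: `total_memory_bytes` and `extra_test_args` are hypothetical helpers, not part of gobbli, and the threshold is read as 12 GiB. Memory detection relies on `os.sysconf`, which is only available on Unix-like systems, and it reports host memory, which may be more than your Docker daemon is allowed to use.

```python
import os

# Assumption: the 12GB guidance is interpreted as 12 GiB.
LOW_RESOURCE_THRESHOLD_BYTES = 12 * 1024 ** 3


def total_memory_bytes():
    """Best-effort physical memory size, or None if it can't be determined."""
    try:
        return os.sysconf("SC_PAGE_SIZE") * os.sysconf("SC_PHYS_PAGES")
    except (AttributeError, ValueError, OSError):
        # os.sysconf is missing on some platforms, or the names are unknown.
        return None


def extra_test_args(memory_bytes):
    """Return the extra test arguments suggested for this much memory."""
    if memory_bytes is not None and memory_bytes < LOW_RESOURCE_THRESHOLD_BYTES:
        return ["--low-resource"]
    return []


if __name__ == "__main__":
    print(extra_test_args(total_memory_bytes()))
```

When memory can't be detected, the sketch adds nothing and leaves the decision to you.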

NOTE: If running on a Mac, even with adequate memory available, you may encounter out-of-memory errors (exit status 137) when running the tests. This happens when too little memory is allocated to your Docker daemon. Try going to Docker for Mac -> Preferences -> Advanced and raising "Memory" to 12GiB or more.
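
For context on the number 137: by convention, an exit status above 128 means the process was killed by signal number (status - 128), and 137 - 128 = 9 is SIGKILL, which is what a process receives when it's killed for exceeding available memory. The sketch below decodes such a status; `describe_exit_status` is a hypothetical helper for illustration and assumes a Unix-like system.

```python
import signal


def describe_exit_status(status):
    """Explain a shell-style exit status, where 128 + N means 'killed by signal N'."""
    if status > 128:
        try:
            name = signal.Signals(status - 128).name
        except ValueError:
            # Not a known signal number on this platform.
            return "exit status %d" % status
        return "killed by %s" % name
    return "exit status %d" % status


if __name__ == "__main__":
    print(describe_exit_status(137))
```

A SIGKILL doesn't prove memory was the cause, but together with the conditions described above it's the most likely explanation.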

If you want to run the tests with GPU(s) enabled, see the --use-gpu and --nvidia-visible-devices arguments under py.test --help. If your local machine doesn't have an NVIDIA GPU, but you have SSH access to a machine that does, you can use the test_remote_gpu.sh script to run the tests with GPU enabled over SSH.

Docs

To generate the docs, install the docs requirements:

pip install -r docs/requirements.txt

Since doc structure is auto-generated from the library, you must have the library (and all its dependencies) installed as well.

Then, run the following from the repository root:

./generate_docs.sh

Then browse the generated documentation in docs/_build/html.
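
If you'd rather open the result from a script, something like the following may work. It's a minimal sketch, not part of the repository: `docs_index` and `open_docs` are hypothetical helpers, and the sketch assumes the build writes an `index.html` entry page into docs/_build/html.

```python
import webbrowser
from pathlib import Path


def docs_index(repo_root="."):
    """Return the expected path of the generated docs' entry page.

    Assumption: the build produces an index.html in docs/_build/html.
    """
    return Path(repo_root) / "docs" / "_build" / "html" / "index.html"


def open_docs(repo_root="."):
    """Open the generated docs in the default browser."""
    index = docs_index(repo_root).resolve()
    if not index.is_file():
        raise FileNotFoundError(f"{index} not found; run ./generate_docs.sh first")
    # as_uri() needs an absolute path, which resolve() provides.
    return webbrowser.open(index.as_uri())
```

Call `open_docs()` from the repository root after the docs have been generated.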

Attribution

gobbli wouldn't exist without the public release of several state-of-the-art models. The library incorporates:

Original work on the library was funded by RTI International.

Logo design by Marcia Underwood.

More Repositories

1. SMART (Python, 206 stars): Smarter Manual Annotation for Resource-constrained collection of Training data
2. harness-vue (JavaScript, 10 stars)
3. harness (JavaScript, 8 stars): The Harness vue plugin
4. rollmatch (R, 7 stars): Rolling Entry Matching R Package
5. teehr (Python, 6 stars): Tools for Exploratory Evaluation in Hydrologic Research
6. code_docker_lib (Dockerfile, 5 stars): Dockerized tools for the Center for Omics Discovery and Epidemiology
7. PushshiftRedditDistiller (Julia, 4 stars): This package is intended to assist with downloading, extracting, and distilling the monthly reddit data dumps made available through pushshift.io
8. hydro-evaluation (Jupyter Notebook, 4 stars): Test code for the CIROH Evaluation System project.
9. biocloud_docker_tools (C, 4 stars)
10. diabetes-simbackend-only (Python, 4 stars)
11. biocloud_gwas_workflows (WDL, 3 stars)
12. NCMInD (Python, 3 stars)
13. nc-mind-covid-19 (Python, 2 stars)
14. mobForest (R, 2 stars)
15. BigSurv18-Spark-for-Social-Science (Jupyter Notebook, 2 stars)
16. harness-starter-template (JavaScript, 2 stars): A starter template for building web dashboards with RTI's Harness Vue Plugins
17. comprehensive-model-schema (2 stars): JSON Schema for the comprehensive diabetes model.
18. harness-ui (Vue, 2 stars)
19. AEG-RTI_H2Models (Python, 2 stars)
20. crcsim (Python, 2 stars): Colorectal cancer (CRC) simulation model, designed to examine the impacts of screening strategies and patient compliance on outcomes like mortality rate.
21. ld-regression-pipeline (WDL, 1 star): WDL workflow for running LD-regression of GWAS summary statistics against one or more phenotypes of interest
22. childcare_lead_BNmodels (R, 1 star): Code to develop BN models to predict water lead risks in child care centers.
23. rota-app (Python, 1 star): Streamlit application for using ROTA
24. nc-mind (HTML, 1 star): NC MInD Website
25. virtual-opioid-user (Python, 1 star): Continuous model of an individual's opioid use over time
26. cervical-cancer-abm (Python, 1 star): Cervical Cancer Prevention Agent-Based Model
27. LISTS_REDCap_project (1 star): LISTS (Longitudinal Implementation Strategy Tracker System) REDCap project
28. harness-vue-bootstrap (Vue, 1 star)
29. teehr-may-2023-workshop (Python, 1 star): Materials for the May 2023 CIROH Developers Conference
30. rota (Python, 1 star): Rapid Offense Text Autocoder
31. researchnet (Python, 1 star): RTI’s ResearchNet is a flexible, cloud-enabled backend for Computer Assisted Self Interview (CASI) systems. This platform provides a secure mechanism for managing enrollment, processing consent, and collecting survey data.
32. csv-to-embeddings-model (Python, 1 star): Trains a model on top of SBERT's pretrained models with given training pairs, to be used with Python's Sentence Transformers