allenai/lm-explorer

Stars: 127
Rank: 282,790 (Top 6%)
Language: Python
License: Apache License 2.0
Created over 5 years ago
Updated almost 3 years ago

lm-explorer

interactive explorer for language models (currently only OpenAI GPT-2)

Running with Docker

# This creates a local directory where the model can be cached so you don't
# have to download it every time you execute 'docker run'.
$ mkdir -p "$HOME/.pytorch_pretrained_bert"
$ docker build -t lm-explorer:latest .
$ docker run -p 8000:8000 \
    -v "$HOME/.pytorch_pretrained_bert":/root/.pytorch_pretrained_bert \
    -v "$(pwd)":/local \
    lm-explorer:latest \
    python app.py --port 8000 --dev

Running without Docker

First create and activate a Python 3.6 (or later) virtual environment. Then install the requirements

$ pip install -r requirements.txt

and start the app

$ python app.py --port 8000 --dev
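
Once the app is started, with or without Docker, it can be handy to confirm that something is actually listening on the chosen port before opening a browser. The following is a minimal sketch and not part of lm-explorer: the helper name `port_is_open` is hypothetical, and it assumes the app is reachable on localhost at port 8000, as in the commands above. It only tests that a TCP connection succeeds and makes no assumptions about the app's routes.

```python
import socket


def port_is_open(host: str = "127.0.0.1", port: int = 8000, timeout: float = 1.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within `timeout` seconds."""
    try:
        # create_connection raises OSError (e.g. ConnectionRefusedError, timeout)
        # when nothing is accepting connections on the given address.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


if __name__ == "__main__":
    # Port 8000 matches the --port value used in the commands above.
    print("listening" if port_is_open() else "not reachable")
```

If the check reports that the port is not reachable under Docker, the `-p 8000:8000` mapping and the `--port` value are the first things to compare.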

allennlp

An open-source NLP research library, built on PyTorch.

OLMo

Modeling, training, eval, and inference code for OLMo

RL4LMs

A modular RL library to fine-tune language models to human preferences

longformer

Longformer: The Long-Document Transformer

bilm-tf

Tensorflow implementation of contextualized word representations from bi-directional language models

scispacy

A full spaCy pipeline and models for scientific/biomedical documents.

bi-att-flow

Bi-directional Attention Flow (BiDAF) network is a multi-stage hierarchical process that represents context at different levels of granularity and uses a bi-directional attention flow mechanism to achieve a query-aware context representation without early summarization.

scibert

A BERT model for scientific text.

open-instruct

ai2thor

An open-source platform for Visual AI.

dolma

Data and tools for generating and inspecting OLMo pre-training data.

XNOR-Net

ImageNet classification using binary Convolutional Neural Networks

s2orc

S2ORC: The Semantic Scholar Open Research Corpus: https://www.aclweb.org/anthology/2020.acl-main.447/

mmc4

MultimodalC4 is a multimodal extension of c4 that interleaves millions of images with text.

scitldr

objaverse-xl

🪐 Objaverse-XL is a Universe of 10M+ 3D Objects. Contains API Scripts for Downloading and Processing!

papermage

library supporting NLP and CV research on scientific papers

natural-instructions

Expanding natural instructions

visprog

Official code for VisProg (CVPR 2023 Best Paper!)

science-parse

Science Parse parses scientific papers (in PDF form) and returns them in structured form.

pdffigures2

Given a scholarly PDF, extract figures, tables, captions, and section titles.

writing-code-for-nlp-research-emnlp2018

A companion repository for the "Writing code for NLP Research" Tutorial at EMNLP 2018

tango

Organize your experiments into discrete steps that can be cached and reused throughout the lifetime of your research project.

allennlp-models

Officially supported AllenNLP models

specter

SPECTER: Document-level Representation Learning using Citation-informed Transformers

dont-stop-pretraining

Code associated with the Don't Stop Pretraining ACL 2020 paper

unified-io-2

macaw

Multi-angle c(q)uestion answering

lumos

Code and data for "Lumos: Learning Agents with Unified Data, Modular Design, and Open-Source LLMs"

document-qa

scholarphi

An interactive PDF reader.

deep_qa

A deep NLP library, based on Keras / tf, focused on question answering (but useful for other NLP too)

acl2018-semantic-parsing-tutorial

Materials from the ACL 2018 tutorial on neural semantic parsing

unifiedqa

UnifiedQA: Crossing Format Boundaries With a Single QA System

pawls

Software that makes labeling PDFs easy.

OLMoE

OLMoE: Open Mixture-of-Experts Language Models

Jupyter Notebook

kb

KnowBert -- Knowledge Enhanced Contextual Word Representations

PeerRead

Data and code for Kang et al., NAACL 2018's paper titled "A Dataset of Peer Reviews (PeerRead): Collection, Insights and NLP Applications"

reward-bench

RewardBench: the first evaluation tool for reward models.

naacl2021-longdoc-tutorial

openie-standalone

Quality information extraction at web scale.

Holodeck

CVPR 2024: Language Guided Generation of 3D Embodied AI Environments.

python-package-template

A template repo for Python packages

allenact

An open source framework for research in Embodied-AI from AI2.

ir_datasets

Provides a common interface to many IR ranking datasets.

s2orc-doc2json

Parsers for scientific papers (PDF2JSON, TEX2JSON, JATS2JSON)

acl2022-zerofewshot-tutorial

OLMo-Eval

Evaluation suite for LLMs

procthor

🏘️ Scaling Embodied AI by Procedurally Generating Interactive 3D Houses

fm-cheatsheet

Website for hosting the Open Foundation Models Cheat Sheet.

FineGrainedRLHF

beaker-cli

A collaborative platform for rapid and reproducible research.

comet-atomic-2020

spv2

Science-parse version 2

scifact

Data and models for the SciFact verification task.

objaverse-rendering

📷 Scripts for rendering Objaverse

ScienceWorld

ScienceWorld is a text-based virtual environment centered around accomplishing tasks from the standardized elementary science curriculum.

unified-io-inference

Jupyter Notebook

allennlp-demo

Code for the AllenNLP demo.

citeomatic

A citation recommendation system that allows users to find relevant citations for their paper drafts. The tool is backed by Semantic Scholar's OpenCorpus dataset.

Jupyter Notebook

cartography

Dataset Cartography: Mapping and Diagnosing Datasets with Training Dynamics

Jupyter Notebook

savn

Learning to Learn how to Learn: Self-Adaptive Visual Navigation using Meta-Learning (https://arxiv.org/abs/1812.00971)

vampire

Variational Methods for Pretraining in Resource-limited Environments

vila

Incorporating VIsual LAyout Structures for Scientific Text Classification

s2-folks

Public space for the user community of Semantic Scholar APIs to share scripts, report issues, and make suggestions.

hidden-networks

cord19

Get started with CORD-19

mmda

multimodal document analysis

Jupyter Notebook

PRIMER

The official code for PRIMERA: Pyramid-based Masked Sentence Pre-training for Multi-document Summarization

catwalk

This project studies the performance and robustness of language models and task-adaptation methods.

dnw

Discovering Neural Wirings (https://arxiv.org/abs/1906.00586)

deepfigures-open

Companion code to the paper "Extracting Scientific Figures with Distantly Supervised Neural Networks" 🤖

tpu_pretrain

LM Pretraining with PyTorch/TPU

allentune

Hyperparameter Search for AllenNLP

SciREX

Data/Code Repository for https://api.semanticscholar.org/CorpusID:218470122

scidocs

Dataset accompanying the SPECTER model

pdffigures

Command line tool to extract figures, tables, and captions from scholarly documents in PDF form.

OpenBookQA

Code for experiments on OpenBookQA from the EMNLP 2018 paper "Can a Suit of Armor Conduct Electricity? A New Dataset for Open Book Question Answering"

peS2o

Pretraining Efficiently on S2ORC!

gooaq

Question-answers, collected from Google

allennlp-as-a-library-example

A simple example for how to build your own model using AllenNLP as a dependency.

embodied-clip

Official codebase for EmbCLIP

multimodalqa

alexafsm

With alexafsm, developers can model dialog agents with first-class concepts such as states, attributes, transition, and actions. alexafsm also provides visualization and other tools to help understand, test, debug, and maintain complex FSM conversations.

allennlp-semparse

A framework for building semantic parsers (including neural module networks) with AllenNLP, built by the authors of AllenNLP

scicite

Repository for NAACL 2019 paper on Citation Intent prediction

ai2thor-rearrangement

🔀 Visual Room Rearrangement

commonsense-kg-completion

medicat

Dataset of medical images, captions, subfigure-subcaption annotations, and inline textual references

real-toxicity-prompts

Jupyter Notebook

s2search

The Semantic Scholar Search Reranker

aristo-mini

Aristo mini is a light-weight question answering system that can quickly evaluate Aristo science questions with an evaluation web server and the provided baseline solvers.

gpv-1

A task-agnostic vision-language architecture as a step towards General Purpose Vision

Jupyter Notebook

flex

Few-shot NLP benchmark for unified, rigorous eval

elastic

manipulathor

ManipulaTHOR, a framework that facilitates visual manipulation of objects using a robotic arm

Jupyter Notebook

spoc-robot-training

SPOC: Imitating Shortest Paths in Simulation Enables Effective Navigation and Manipulation in the Real World

S2AND

Semantic Scholar's Author Disambiguation Algorithm & Evaluation Suite

propara

ProPara (Process Paragraph Comprehension) dataset and models

ARC-Solvers

ARC Question Solvers