Reading Wikipedia to Answer Open-Domain Questions

DrQA

This is a PyTorch implementation of the DrQA system described in the ACL 2017 paper Reading Wikipedia to Answer Open-Domain Questions.

Machine Reading at Scale

DrQA is a system for reading comprehension applied to open-domain question answering. In particular, DrQA is targeted at the task of "machine reading at scale" (MRS). In this setting, we are searching for an answer to a question in a potentially very large corpus of unstructured documents (that may not be redundant). Thus the system has to combine the challenges of document retrieval (finding the relevant documents) with that of machine comprehension of text (identifying the answers from those documents).

Our experiments with DrQA focus on answering factoid questions while using Wikipedia as the unique knowledge source for documents. Wikipedia is a well-suited source of large-scale, rich, detailed information. In order to answer any question, one must first retrieve the few potentially relevant articles among more than 5 million, and then scan them carefully to identify the answer.

Note that DrQA treats Wikipedia as a generic collection of articles and does not rely on its internal graph structure. As a result, DrQA can be straightforwardly applied to any collection of documents, as described in the retriever README.

This repository includes code, data, and pre-trained models for processing and querying Wikipedia as described in the paper; see Trained Models and Data. We also list several datasets for evaluation; see QA Datasets. Note that this work is a refactored and more efficient version of the original code. Reproduction numbers are very similar but not exact.

Quick Start: Demo

Install DrQA and download our models to start asking open-domain questions!

Run python scripts/pipeline/interactive.py to drop into an interactive session. For each question, the top span and the Wikipedia paragraph it came from are returned.

>>> process('What is question answering?')

Top Predictions:
+------+----------------------------------------------------------------------------------------------------------+--------------------+--------------+-----------+
| Rank |                                                  Answer                                                  |        Doc         | Answer Score | Doc Score |
+------+----------------------------------------------------------------------------------------------------------+--------------------+--------------+-----------+
|  1   | a computer science discipline within the fields of information retrieval and natural language processing | Question answering |    1917.8    |   327.89  |
+------+----------------------------------------------------------------------------------------------------------+--------------------+--------------+-----------+

Contexts:
[ Doc = Question answering ]
Question Answering (QA) is a computer science discipline within the fields of
information retrieval and natural language processing (NLP), which is
concerned with building systems that automatically answer questions posed by
humans in a natural language.

>>> process('What is the answer to life, the universe, and everything?')

Top Predictions:
+------+--------+---------------------------------------------------+--------------+-----------+
| Rank | Answer |                        Doc                        | Answer Score | Doc Score |
+------+--------+---------------------------------------------------+--------------+-----------+
|  1   |   42   | Phrases from The Hitchhiker's Guide to the Galaxy |    47242     |   141.26  |
+------+--------+---------------------------------------------------+--------------+-----------+

Contexts:
[ Doc = Phrases from The Hitchhiker's Guide to the Galaxy ]
The number 42 and the phrase, "Life, the universe, and everything" have
attained cult status on the Internet. "Life, the universe, and everything" is
a common name for the off-topic section of an Internet forum and the phrase is
invoked in similar ways to mean "anything at all". Many chatbots, when asked
about the meaning of life, will answer "42". Several online calculators are
also programmed with the Question. Google Calculator will give the result to
"the answer to life the universe and everything" as 42, as will Wolfram's
Computational Knowledge Engine. Similarly, DuckDuckGo also gives the result of
"the answer to the ultimate question of life, the universe and everything" as
42. In the online community Second Life, there is a section on a sim called
"42nd Life." It is devoted to this concept in the book series, and several
attempts at recreating Milliways, the Restaurant at the End of the Universe, were made.

>>> process('Who was the winning pitcher in the 1956 World Series?')

Top Predictions:
+------+------------+------------------+--------------+-----------+
| Rank |   Answer   |       Doc        | Answer Score | Doc Score |
+------+------------+------------------+--------------+-----------+
|  1   | Don Larsen | New York Yankees |  4.5059e+06  |   278.06  |
+------+------------+------------------+--------------+-----------+

Contexts:
[ Doc = New York Yankees ]
In 1954, the Yankees won over 100 games, but the Indians took the pennant with
an AL record 111 wins; 1954 was famously referred to as "The Year the Yankees
Lost the Pennant". In , the Dodgers finally beat the Yankees in the World
Series, after five previous Series losses to them, but the Yankees came back
strong the next year. On October 8, 1956, in Game Five of the 1956 World
Series against the Dodgers, pitcher Don Larsen threw the only perfect game in
World Series history, which remains the only perfect game in postseason play
and was the only no-hitter of any kind to be pitched in postseason play until
Roy Halladay pitched a no-hitter on October 6, 2010.

Try some of your own! Of course, DrQA might provide alternative facts, so enjoy the ride.

Installing DrQA

Setting up DrQA is easy!

DrQA requires Linux/OSX and Python 3.5 or higher. It also requires installing PyTorch version 1.0. Its other dependencies are listed in requirements.txt. CUDA is strongly recommended for speed, but not necessary.

Run the following commands to clone the repository and install DrQA:

git clone https://github.com/facebookresearch/DrQA.git
cd DrQA; pip install -r requirements.txt; python setup.py develop

Note: requirements.txt includes a subset of all the possible required packages. Depending on what you want to run, you might need to install an extra package (e.g. spacy).

If you use the CoreNLPTokenizer or SpacyTokenizer, you also need to download the Stanford CoreNLP jars or the spaCy en model, respectively. If you use Stanford CoreNLP, either add the jars to your Java CLASSPATH environment variable or set the path programmatically with:

import drqa.tokenizers
drqa.tokenizers.set_default('corenlp_classpath', '/your/corenlp/classpath/*')

IMPORTANT: The default tokenizer is CoreNLP, so you will need it in your CLASSPATH to run the README examples.

Ex: export CLASSPATH=$CLASSPATH:/path/to/corenlp/download/*.

If you do not already have a CoreNLP download you can run:

./install_corenlp.sh

Verify that it runs:

from drqa.tokenizers import CoreNLPTokenizer
tok = CoreNLPTokenizer()
tok.tokenize('hello world').words()  # Should complete immediately

For convenience, the Document Reader, Retriever, and Pipeline modules will try to load default models if no model argument is given. See below for downloading these models.

Trained Models and Data

To download all provided trained models and data for Wikipedia question answering, run:

./download.sh

Warning: this downloads a 7.5GB tarball (25GB untarred) and will take some time.

This stores the data in data/ at the file paths specified in the various modules' defaults. This top-level directory can be modified by setting a DRQA_DATA environment variable to point to somewhere else.

Default directory structure (see embeddings for more info on additional downloads for training):

DrQA
├── data (or $DRQA_DATA)
    ├── datasets
    │   ├── SQuAD-v1.1-<train/dev>.<txt/json>
    │   ├── WebQuestions-<train/test>.txt
    │   ├── freebase-entities.txt
    │   ├── CuratedTrec-<train/test>.txt
    │   └── WikiMovies-<train/test/entities>.txt
    ├── reader
    │   ├── multitask.mdl
    │   └── single.mdl
    └── wikipedia
        ├── docs.db
        └── docs-tfidf-ngram=2-hash=16777216-tokenizer=simple.npz

Default model paths for the different modules can also be modified programmatically in the code, e.g.:

import drqa.reader
drqa.reader.set_default('model', '/path/to/model')
reader = drqa.reader.Predictor()  # Default model loaded for prediction

Document Retriever

TF-IDF model using Wikipedia (unigrams and bigrams, 2^24 bins, simple tokenization), evaluated on multiple datasets (test sets, dev set for SQuAD):

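| Model        | SQuAD P@5 | CuratedTREC P@5 | WebQuestions P@5 | WikiMovies P@5 | Size  |
|--------------|-----------|-----------------|------------------|----------------|-------|
| TF-IDF model | 78.0      | 87.6            | 75.0             | 69.8           | ~13GB |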

P@5 here is defined as the % of questions for which the answer segment appears in one of the top 5 documents.
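
The P@5 definition above can be sketched in a few lines. This is only an illustration, assuming plain case-insensitive substring matching; the function name `precision_at_k` and the toy data are hypothetical, and the actual evaluation script may match answers differently (for example on tokenized text).

```python
def precision_at_k(retrieved_docs, answers, k=5):
    """Percentage of questions whose top-k documents contain an answer string.

    retrieved_docs: one list of document texts per question, best first.
    answers: one list of acceptable answer strings per question.
    """
    hits = 0
    for docs, golds in zip(retrieved_docs, answers):
        top_k = [d.lower() for d in docs[:k]]
        # Assumption: a case-insensitive substring match counts as a hit.
        if any(g.lower() in d for g in golds for d in top_k):
            hits += 1
    return 100.0 * hits / len(answers)


# Toy data: the first question is answered by a retrieved document, the second is not.
docs = [
    ["Don Larsen threw a perfect game in 1956.", "The Yankees play in New York."],
    ["Paris is the capital of France."],
]
golds = [["Don Larsen"], ["Berlin"]]
score = precision_at_k(docs, golds)
```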

Document Reader

Model trained only on SQuAD, evaluated in the SQuAD setting:

| Model        | SQuAD Dev EM | SQuAD Dev F1 | Size   |
|--------------|--------------|--------------|--------|
| Single model | 69.4         | 78.9         | ~130MB |

Model trained with distant supervision without NER/POS/lemma features, evaluated on multiple datasets (test sets, dev set for SQuAD) in the full Wikipedia setting:

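| Model           | SQuAD EM | CuratedTREC EM | WebQuestions EM | WikiMovies EM | Size   |
|-----------------|----------|----------------|-----------------|---------------|--------|
| Multitask model | 29.5     | 27.2           | 18.5            | 36.9          | ~270MB |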

Wikipedia

Our full-scale experiments were conducted on the 2016-12-21 dump of English Wikipedia. The dump was processed with the WikiExtractor and filtered for internal disambiguation, list, index, and outline pages (pages that are typically just links). We store the documents in an sqlite database for which drqa.retriever.DocDB provides an interface.

| Database  | Num. Documents | Size  |
|-----------|----------------|-------|
| Wikipedia | 5,075,182      | ~13GB |

QA Datasets

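The datasets used for DrQA training and evaluation are SQuAD, WebQuestions, WikiMovies, and CuratedTREC (the default directory structure above lists the corresponding files).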

Format A

The retriever/eval.py, pipeline/eval.py, and distant/generate.py scripts expect the datasets as a .txt file where each line is a JSON encoded QA pair, like so:

'{"question": "q1", "answer": ["a11", ..., "a1i"]}'
...
'{"question": "qN", "answer": ["aN1", ..., "aNi"]}'

Scripts to convert SQuAD and WebQuestions to this format are included in scripts/convert. This is automatically done in download.sh.
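
As a rough illustration of Format A, the sketch below writes and parses such lines with the standard `json` module. It assumes each line of the file is a bare JSON object with the `question` and `answer` keys shown above; the helper names `to_format_a` and `read_format_a` are hypothetical and are not part of DrQA.

```python
import json


def to_format_a(question, answers):
    """Serialize one QA pair as a single JSON line."""
    return json.dumps({"question": question, "answer": list(answers)})


def read_format_a(lines):
    """Parse an iterable of JSON lines into (question, answers) pairs."""
    pairs = []
    for line in lines:
        line = line.strip()
        if not line:
            continue  # tolerate blank lines
        record = json.loads(line)
        pairs.append((record["question"], record["answer"]))
    return pairs


sample = [
    to_format_a("Who was the winning pitcher in the 1956 World Series?", ["Don Larsen"]),
    to_format_a("What is the answer to life, the universe, and everything?", ["42"]),
]
pairs = read_format_a(sample)
```

The same `read_format_a` call works on an open file object, since iterating over a text file yields one line at a time.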

Format B

The reader directory scripts expect the datasets as a .json file where the data is arranged like SQuAD:

file.json
├── "data"
│   └── [i]
│       ├── "paragraphs"
│       │   └── [j]
│       │       ├── "context": "paragraph text"
│       │       └── "qas"
│       │           └── [k]
│       │               ├── "answers"
│       │               │   └── [l]
│       │               │       ├── "answer_start": N
│       │               │       └── "text": "answer"
│       │               ├── "id": "<uuid>"
│       │               └── "question": "paragraph question?"
│       └── "title": "document id"
└── "version": 1.1
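
To make the nesting of Format B concrete, here is a minimal sketch that walks a SQuAD-style dictionary and yields one flat record per question. The function name `iter_examples` and the toy document are illustrative assumptions; only the key names come from the layout above.

```python
def iter_examples(dataset):
    """Yield (id, question, context, answers) from a SQuAD-style dict."""
    for article in dataset["data"]:
        for paragraph in article["paragraphs"]:
            context = paragraph["context"]
            for qa in paragraph["qas"]:
                answers = [(a["answer_start"], a["text"]) for a in qa["answers"]]
                yield qa["id"], qa["question"], context, answers


# Toy document; "answer_start" is the character offset of the answer in the context.
context = "Live Free or Die is the official motto of New Hampshire."
toy = {
    "version": "1.1",
    "data": [{
        "title": "New Hampshire",
        "paragraphs": [{
            "context": context,
            "qas": [{
                "id": "q-0",
                "question": "Which state's motto is Live Free or Die?",
                "answers": [{
                    "answer_start": context.index("New Hampshire"),
                    "text": "New Hampshire",
                }],
            }],
        }],
    }],
}
examples = list(iter_examples(toy))
```
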

Entity lists

Some datasets have (potentially large) candidate lists for selecting answers. For example, WikiMovies' answers are OMDb entries while WebQuestions is based on Freebase. If we have known candidates, we can impose that all predicted answers must be in this list by discarding any higher scoring spans that are not.
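
The candidate restriction described above could look roughly like the following. This is a sketch under stated assumptions: spans arrive as `(text, score)` pairs, matching is case-insensitive exact string equality, and the name `best_candidate_span` is hypothetical. DrQA's own normalization may differ.

```python
def best_candidate_span(scored_spans, candidates):
    """Return the highest scoring (text, score) whose text is a known candidate."""
    allowed = {c.lower() for c in candidates}
    # Walk spans from best to worst and skip any that are not in the list.
    for text, score in sorted(scored_spans, key=lambda s: s[1], reverse=True):
        if text.lower() in allowed:
            return text, score
    return None  # no predicted span matches a candidate


spans = [("the Yankees pitcher", 9.1), ("Don Larsen", 7.4), ("New York", 3.2)]
entities = ["Don Larsen", "Roy Halladay"]
choice = best_candidate_span(spans, entities)
```

Here the top-scoring span is discarded because it is not an entity, so the second-best span is returned.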

DrQA Components

Document Retriever

DrQA is not tied to any specific type of retrieval system; any retriever can be used as long as it effectively narrows the search space and focuses on relevant documents.

Following classical QA systems, we include an efficient (non-machine learning) document retrieval system based on sparse, TF-IDF weighted bag-of-word vectors. We use bags of hashed n-grams (here, unigrams and bigrams).
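
The idea of hashed n-gram TF-IDF retrieval can be sketched with the standard library alone. Everything here is an assumption made for illustration: `zlib.crc32` stands in for whatever hash function the retriever uses, the weighting formula is one common TF-IDF variant rather than necessarily DrQA's, and the function names are hypothetical. The real retriever stores sparse matrices rather than Python dicts.

```python
import math
import re
import zlib
from collections import Counter

NUM_BINS = 2 ** 24  # number of hash bins, matching the 2^24 quoted earlier


def hashed_ngrams(text):
    """Map the unigrams and bigrams of `text` to hash-bin counts."""
    tokens = re.findall(r"\w+", text.lower())
    grams = tokens + [" ".join(pair) for pair in zip(tokens, tokens[1:])]
    return Counter(zlib.crc32(g.encode("utf-8")) % NUM_BINS for g in grams)


def build_index(docs):
    """Return per-document TF-IDF vectors (as dicts) and the IDF table."""
    counts = [hashed_ngrams(d) for d in docs]
    df = Counter(b for c in counts for b in c)  # document frequency per bin
    idf = {b: math.log((len(docs) + 1) / (n + 0.5)) for b, n in df.items()}
    vectors = [{b: math.log1p(tf) * idf[b] for b, tf in c.items()} for c in counts]
    return vectors, idf


def rank(query, vectors, idf, k=5):
    """Return indices of the k documents with the largest dot product."""
    q = {b: math.log1p(tf) * idf.get(b, 0.0) for b, tf in hashed_ngrams(query).items()}
    scores = [sum(w * v.get(b, 0.0) for b, w in q.items()) for v in vectors]
    return sorted(range(len(vectors)), key=lambda i: scores[i], reverse=True)[:k]


docs = [
    "Question answering is a discipline within information retrieval.",
    "Don Larsen threw a perfect game in the 1956 World Series.",
    "The Hitchhiker's Guide to the Galaxy gives 42 as the answer.",
]
vectors, idf = build_index(docs)
top = rank("Who pitched the perfect game in the World Series?", vectors, idf)
```

Hashing keeps the vocabulary size fixed regardless of how many distinct bigrams the corpus contains, at the cost of occasional collisions between unrelated n-grams.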

To see how to build your own such model on new documents, see the retriever README.

To interactively query Wikipedia:

python scripts/retriever/interactive.py --model /path/to/model

If --model is left out, our default model will be used (assuming it was downloaded).

To evaluate the retriever accuracy (% match in top 5) on a dataset:

python scripts/retriever/eval.py /path/to/format/A/dataset.txt --model /path/to/model

Document Reader

DrQA's Document Reader is a multi-layer recurrent neural network machine comprehension model trained to do extractive question answering. That is, the model tries to find the answer to any question as a text span in one of the returned documents.
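
Extractive readers of this kind typically score every token as a possible answer start and end, then pick the best-scoring span. The sketch below shows one common decoding rule, assuming the model has already produced per-token start and end probabilities; the `max_len` cap and the name `best_span` are illustrative choices, not a description of the repository's code.

```python
def best_span(start_scores, end_scores, max_len=15):
    """Return (start, end, score) maximizing start_scores[i] * end_scores[j]
    subject to i <= j < i + max_len."""
    best = (0, 0, float("-inf"))
    for i, s in enumerate(start_scores):
        for j in range(i, min(i + max_len, len(end_scores))):
            score = s * end_scores[j]
            if score > best[2]:
                best = (i, j, score)
    return best


# Toy per-token probabilities for a seven-token paragraph.
tokens = ["pitcher", "Don", "Larsen", "threw", "a", "perfect", "game"]
p_start = [0.05, 0.80, 0.05, 0.02, 0.03, 0.03, 0.02]
p_end = [0.02, 0.10, 0.75, 0.03, 0.02, 0.03, 0.05]
i, j, score = best_span(p_start, p_end)
answer = " ".join(tokens[i:j + 1])
```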

The Document Reader was inspired by, and primarily trained on, the SQuAD dataset. It can also be used standalone on such SQuAD-like tasks where a specific context is supplied with the question, the answer to which is contained in the context.

To see how to train the Document Reader on SQuAD, see the reader README.

To interactively ask questions about text with a trained model:

python scripts/reader/interactive.py --model /path/to/model

Again, here model is optional; a default model will be used if it is left out.

To run model predictions on a dataset:

python scripts/reader/predict.py /path/to/format/B/dataset.json --model /path/to/model

DrQA Pipeline

The full system is linked together in drqa.pipeline.DrQA.

To interactively ask questions using the full DrQA:

python scripts/pipeline/interactive.py

Optional arguments:

--reader-model    Path to trained Document Reader model.
--retriever-model Path to Document Retriever model (tfidf).
--doc-db          Path to Document DB.
--tokenizer       String option specifying tokenizer type to use (e.g. 'corenlp').
--candidate-file  List of candidates to restrict predictions to, one candidate per line.
--no-cuda         Use CPU only.
--gpu             Specify GPU device id to use.

To run predictions on a dataset:

python scripts/pipeline/predict.py /path/to/format/A/dataset.txt

Optional arguments:

--out-dir             Directory to write prediction file to (<dataset>-<model>-pipeline.preds).
--reader-model        Path to trained Document Reader model.
--retriever-model     Path to Document Retriever model (tfidf).
--doc-db              Path to Document DB.
--embedding-file      Expand dictionary to use all pretrained embeddings in this file (e.g. all glove vectors to minimize UNKs at test time).
--candidate-file      List of candidates to restrict predictions to, one candidate per line.
--n-docs              Number of docs to retrieve per query.
--top-n               Number of predictions to make per query.
--tokenizer           String option specifying tokenizer type to use (e.g. 'corenlp').
--no-cuda             Use CPU only.
--gpu                 Specify GPU device id to use.
--parallel            Use data parallel (split across GPU devices).
--num-workers         Number of CPU processes (for tokenizing, etc).
--batch-size          Document paragraph batching size (Reduce in case of GPU OOM).
--predict-batch-size  Question batching size (Reduce in case of CPU OOM).

Distant Supervision (DS)

DrQA's performance improves significantly in the full Wikipedia setting when it is provided with distantly supervised data from additional datasets. Given question-answer pairs but no supporting context, we can use string matching heuristics to automatically associate paragraphs with these training examples.

Question: What U.S. state’s motto is “Live free or Die”?

Answer: New Hampshire

DS Document: Live Free or Die “Live Free or Die” is the official motto of the U.S. state of New Hampshire, adopted by the state in 1945. It is possibly the best-known of all state mottos, partly because it conveys an assertive independence historically found in American political philosophy and partly because of its contrast to the milder sentiments found in other state mottos.

The scripts/distant directory contains code to generate and inspect such distantly supervised data. More information can be found in the distant supervision README.
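
A string matching heuristic of the kind described in this section might be sketched as follows. This is a simplified assumption, not the logic of scripts/distant: a paragraph is kept when it contains the answer string and shares at least `min_overlap` words with the question. The names `normalize` and `find_ds_paragraphs` and the threshold are hypothetical, and no stopword filtering is applied.

```python
import re


def normalize(text):
    """Lowercase, replace punctuation with spaces, and collapse whitespace."""
    return re.sub(r"\s+", " ", re.sub(r"[^\w\s]", " ", text.lower())).strip()


def find_ds_paragraphs(question, answer, paragraphs, min_overlap=2):
    """Keep paragraphs that contain the answer and share words with the question."""
    q_words = set(normalize(question).split())
    ans = normalize(answer)
    matches = []
    for p in paragraphs:
        norm = normalize(p)
        if ans not in norm:
            continue  # the answer string must appear in the paragraph
        if len(q_words & set(norm.split())) >= min_overlap:
            matches.append(p)
    return matches


question = "What U.S. state's motto is \"Live free or Die\"?"
paragraphs = [
    "Live Free or Die is the official motto of the U.S. state of New Hampshire.",
    "New Hampshire borders Vermont and Maine.",
    "Live Free or Die is a 2006 film.",
]
matches = find_ds_paragraphs(question, "New Hampshire", paragraphs)
```

Only the first paragraph survives: the second contains the answer but shares no words with the question, and the third never mentions the answer.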

Tokenizers

We provide a number of different tokenizer options for convenience. Each has its own pros/cons based on how many dependencies it requires, overhead for running it, speed, and performance. For our reported experiments we used CoreNLP (but results are all similar).

Available tokenizers:

  • CoreNLPTokenizer: Uses Stanford CoreNLP (option: 'corenlp'). We used v3.7.0. Requires Java 8.
  • SpacyTokenizer: Uses spaCy (option: 'spacy').
  • RegexpTokenizer: Custom regex-based PTB-style tokenizer (option: 'regexp').
  • SimpleTokenizer: Basic alpha-numeric/non-whitespace tokenizer (option: 'simple').

See the list of mappings between string option names and tokenizer classes.
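
For intuition, an "alpha-numeric/non-whitespace" tokenizer in the spirit of SimpleTokenizer can be approximated with a single regular expression. This is a sketch using Python's built-in `re` module and is not the library's implementation; `simple_tokenize` is a hypothetical name.

```python
import re

# Either a run of letters/digits, or any single non-whitespace character.
TOKEN_RE = re.compile(r"[^\W_]+|\S")


def simple_tokenize(text):
    """Split text into alphanumeric runs and standalone symbols."""
    return TOKEN_RE.findall(text)


words = simple_tokenize("Who won the 1956 World Series?")
```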

Citation

Please cite the ACL paper if you use DrQA in your work:

@inproceedings{chen2017reading,
  title={Reading {Wikipedia} to Answer Open-Domain Questions},
  author={Chen, Danqi and Fisch, Adam and Weston, Jason and Bordes, Antoine},
  booktitle={Association for Computational Linguistics (ACL)},
  year={2017}
}

DrQA Elsewhere

Connection with ParlAI

This implementation of the DrQA Document Reader is closely related to the one found in ParlAI. Here, however, the work is extended to interact with the Document Retriever in the open-domain setting. On the other hand, the implementation in ParlAI is more general, and follows the appropriate API to work in more QA/Dialog settings.

Web UI

Hamed Zaghaghi has provided a wrapper for a Web UI.

License

DrQA is BSD-licensed.
