  • Language
    Python
  • License
    Apache License 2.0

Repository Details

Code for the TriviaQA reading comprehension dataset

TriviaQA: A Large Scale Distantly Supervised Challenge Dataset for Reading Comprehension

  • This repo contains code for the paper: Mandar Joshi, Eunsol Choi, Daniel Weld, Luke Zettlemoyer. TriviaQA: A Large Scale Distantly Supervised Challenge Dataset for Reading Comprehension. In Association for Computational Linguistics (ACL) 2017, Vancouver, Canada.

Requirements

General

  • Python 3. The evaluation scripts should also run under Python 2.7, provided you handle Unicode in utils.utils.py.
  • BiDAF requires Python 3; check the original BiDAF repository for more details.

Python Packages

  • tensorflow (only if you want to run BiDAF, verified on r0.11)
  • nltk
  • tqdm

Evaluation

The --dataset_file parameter refers to files in the qa directory of the data (e.g., wikipedia-dev.json). For the file format, see the samples directory in the repo.

python3 -m evaluation.triviaqa_evaluation --dataset_file samples/triviaqa_sample.json --prediction_file samples/sample_predictions.json
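
To produce a file for --prediction_file, something like the following may help. This is a minimal sketch, not code from the repo. It assumes the prediction file is a JSON object mapping each question ID to a predicted answer string, and that the dataset file keeps its questions in a top-level "Data" list whose items carry a "QuestionId" field. Check both assumptions against the files in the samples directory. The helper names load_question_ids and write_predictions are hypothetical.

```python
import json


def load_question_ids(dataset_path):
    """Return the question IDs found in a dataset file.

    Assumption: a top-level "Data" list whose items have "QuestionId".
    Verify this against the files in the samples directory.
    """
    with open(dataset_path, encoding="utf-8") as f:
        dataset = json.load(f)
    return [item["QuestionId"] for item in dataset["Data"]]


def write_predictions(predictions, question_ids, output_path):
    """Write {question_id: answer} as JSON.

    Returns the question IDs that have no prediction, so gaps are
    visible before the evaluation script is run.
    """
    missing = [qid for qid in question_ids if qid not in predictions]
    with open(output_path, "w", encoding="utf-8") as f:
        json.dump(predictions, f, ensure_ascii=False, indent=2)
    return missing
```

The returned list makes it easy to spot questions a model skipped, which would otherwise only show up as a lower score.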

Miscellaneous

  • If you have a SQuAD model and want to run on TriviaQA, please refer to utils.convert_to_squad_format.py
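
For orientation, the SQuAD v1.1 JSON layout that such a model expects can be sketched as below. This is not the repo's converter. It is an illustrative sketch with hypothetical helper names (to_squad_example, to_squad_dataset), and it locates the answer with a simple case-sensitive substring search, which is a simplification.

```python
def to_squad_example(question_id, question, context, answer_text):
    """Wrap one question/context pair in the SQuAD v1.1 article layout.

    Returns None when the answer string does not occur in the context,
    because SQuAD needs a character offset for every answer.
    """
    start = context.find(answer_text)  # case-sensitive first occurrence
    if start == -1:
        return None
    return {
        "title": question_id,
        "paragraphs": [{
            "context": context,
            "qas": [{
                "id": question_id,
                "question": question,
                "answers": [{"text": answer_text, "answer_start": start}],
            }],
        }],
    }


def to_squad_dataset(examples):
    """Collect the usable examples under the top-level "data" key."""
    return {"version": "1.1", "data": [e for e in examples if e is not None]}
```

Examples whose answer cannot be located are dropped here rather than given a guessed offset. A real conversion may choose differently, so refer to the repo's script for the actual behavior.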