• Stars
    star
    213
  • Rank 185,410 (Top 4 %)
  • Language
    Python
  • Created over 5 years ago
  • Updated about 3 years ago

Reviews

There are no reviews yet. Be the first to send feedback to the community and the maintainers!

Repository Details

A toolkit for evaluating the linguistic knowledge and transferability of contextual representations. Code for "Linguistic Knowledge and Transferability of Contextual Representations" (NAACL 2019).

Build Status codecov

contextual-repr-analysis

A toolkit for evaluating the linguistic knowledge and transferability of contextual word representations. Code for Linguistic Knowledge and Transferability of Contextual Representations, to appear at NAACL 2019.

For a description of the included tasks, see TASKS.md.

Table of Contents

Installation

This project is being developed in Python 3.6, and CI runs the tests in Python 3.6 as well (via TravisCI).

Conda will set up a virtual environment with the exact version of Python used for development along with all the dependencies needed to run the code.

  1. Download and install Conda.

  2. Change your directory to your clone of this repo.

    cd contextual-repr-analysis
  3. Create a Conda environment with Python 3.6 .

    conda create -n contextual_repr_analysis python=3.6
  4. Now activate the Conda environment. You will need to activate the Conda environment in each terminal in which you want to run code from this repo.

    source activate contextual_repr_analysis
  5. Install the required dependencies.

    pip install -r requirements.txt

You should now be able to test your installation with py.test -v. Congratulations!

Getting Started: Evaluating Representations

This section walks through an example of evaluating ELMo on the English Web Treebank (EWT) English POS tagging task.

Step 1. Precomputing the Word Representations

The easiest way to get started with evaluating your representations is to precompute representations for each word in each sentence in the evaluation dataset. In each of the data/ directories, there exists a text file of sentences (newline delimited, and tokens are space-delimited). These are the sentences used during training and evaluation, so getting representations for these should be enough. If you write a new DatasetReader and want to generate these sentences, use the script at ./scripts/get_config_sentences.py (see python ./scripts/get_config_sentences.py -h for more information on usage).

The format of the HDF5 file should be as follows:

  1. The keys should be numbers (represented as strings), corresponding to line numbers.

  2. The value associated with each key is expected to a numpy array of word representations. Acceptable shapes are (sequence_length, representation_dim) or (num_layers, sequence_length, representation_dim).

  3. Another key, the string value "sentence_to_index", should store a string-serialized JSON dictionary mapping from sentences (the sentences that the representations in the values are calculated from) to string numbers (the other keys of the HDF5 file).

If you have a Dict[str, str] called sentence_to_index and a Dict[str, numpy.ndarray] named vectors containing a mapping from str numbers to vectors (a dictionary with consecutive numbers as keys and numpy arrays as values), you can pass the dictionaries into the following function to produce an HDF5 file with the proper format.

def make_hdf5_file(sentence_to_index, vectors):
    with h5py.File(output_file_path, 'w') as fout:
        for key, embeddings in vectors.items():
            fout.create_dataset(
                str(key),
                embeddings.shape, dtype='float32',
                data=embeddings)
        sentence_index_dataset = fout.create_dataset(
            "sentence_to_index",
            (1,),
            dtype=h5py.special_dtype(vlen=str))
        sentence_index_dataset[0] = json.dumps(sentence_to_index)

Step 2. Creating the experiment configuration

./experiment_configs contains all the experiment configurations used in this project. TODO (nfliu): Write more about writing your own experiment config.

Step 3. Training the probing model

To train a probing model on top of the precomputed word representations, we use the allennlp train command.

Given a single configuration file, we can train with:

allennlp train <config_path> -s <model_save_path> --include-package contexteval

For example, for training a contextualizer on the topmost ELMo layer (the default) for POS tagging:

allennlp train experiment_configs/elmo_original/ewt_pos_tagging.json \
    -s ewt_pos_tagging_topmost_layer \
    --include-package contexteval

Note that the precomputed contextualizers in the experiment config do not have a layer specified. This causes models to default to using the topmost layer. To train on, say, the first layer (index 0), you can run this command:

allennlp train experiment_configs/elmo_original/ewt_pos_tagging.json \
    -s models/elmo_original/ewt_pos_tagging_layer_0 --include-package contexteval \
    --overrides '{"dataset_reader": {"contextualizer": {"layer_num": 0}}, "validation_dataset_reader": {"contextualizer": {"layer_num": 0}}}'

To train on all layers, one-by-one, you can wrap the above in a bash for-loop.

for i in 0 1 2; do allennlp train experiment_configs/elmo_original/ewt_pos_tagging.json \
    -s models/elmo_original/ewt_pos_tagging_layer_${i} --include-package contexteval \
    --overrides '{"dataset_reader": {"contextualizer": {"layer_num": '${i}'}}, "validation_dataset_reader": {"contextualizer": {"layer_num": '${i}'}}}'; done

Step 4: Evaluating the probing model on test data

To evaluate a trained probing model on test data, use the allennlp evaluate command.

To evaluate the three models we trained above and log the output to a file, we can run:

for i in 0 1 2; do allennlp evaluate models/elmo_original/ewt_pos_tagging_layer_${i}/model.tar.gz \
    --evaluation-data-file ./data/pos/en_ewt-ud-test.conllu --cuda-device 0 \
    --include-package contexteval 2>&1 | tee models/elmo_original/ewt_pos_tagging_layer_${i}/evaluation.log; done

References

@InProceedings{liu-gardner-belinkov-peters-smith:2019:NAACL,
  author    = {Liu, Nelson F.  and  Gardner, Matt  and  Belinkov, Yonatan  and  Peters, Matthew E.  and  Smith, Noah A.},
  title     = {Linguistic Knowledge and Transferability of Contextual Representations},
  booktitle = {Proceedings of the Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies},
  year      = {2019}
}

More Repositories

1

paraphrase-id-tensorflow

Various models and code (Manhattan LSTM, Siamese LSTM + Matching Layer, BiMPM) for the paraphrase identification task, specifically with the Quora Question Pairs dataset.
Python
326
star
2

pytorch-manylinux-binaries

Shell
103
star
3

evaluating-verifiability-in-generative-search-engines

Companion repo for "Evaluating Verifiability in Generative Search Engines".
Python
76
star
4

lost-in-the-middle

Code and data for "Lost in the Middle: How Language Models Use Long Contexts"
Python
72
star
5

cython-crash-course

A quick intro to Cython for Python users who don't know C
Jupyter Notebook
32
star
6

flatten_gigaword

Dump the text of the Gigaword dataset into a single file, for use with language modeling (and other!) toolkits
Python
24
star
7

inoculation-by-finetuning

Code for the paper "Inoculation by Fine-Tuning: A Method for Analyzing Challenge Datasets", to be presented at NAACL 2019.
Jsonnet
18
star
8

lexical-semantic-recognition

Python
16
star
9

pytorch-paper-classifier

A simple model for classifying papers by academic venue (AI/ML/ACL), given a title and abstract. Bare-metal PyTorch port of https://github.com/allenai/allennlp-as-a-library-example .
Python
12
star
10

ASLSpeak

🎤 DubHacks 2015 project. Decode sign language using the Leap Motion, and speak it!
Python
9
star
11

website

HTML
9
star
12

MyoDrone

🚁 Controlling a Parrot AR 2 Drone with Thalmic Labs Myo. PennApps 2015 Spring.
JavaScript
7
star
13

word2color

Given a description of a color, return its closest standard HTML4 color.
Python
7
star
14

nlp-phd-vis

Visualizations on NLP PhD applications
Jupyter Notebook
4
star
15

BitStation-App

💸 The MIT Kerberos-integrated social wallet. Winner of BitComp 2014 Improving MIT Award.
Ruby
3
star
16

talks_and_tutorials

Repository for materials for informal and slightly less informal talks and tutorials
HTML
3
star
17

LSTMs-exploit-linguistic-attributes

Code for the paper "LSTMs Exploit Linguistic Attributes of Data", presented at the ACL 2018 Workshop on Representation Learning for NLP.
Python
3
star
18

quoref-annotation

HTML
2
star
19

SMSiri

☎️ Built @ MHacks 6 -- An SMS based natural language question and answer system.
Python
2
star
20

phonesthemes

Code for the paper "Discovering Phonesthemes with Sparse Regularization", to be presented at the NAACL 2018 Workshop on Subword and Character Level Models in NLP.
Python
2
star
21

oov-translation

Code accompanying "Augmenting Statistical Machine Translation with Subword Translation of Out-of-Vocabulary Words"
Python
1
star
22

pytorch-pretrained-bert-feedstock

A conda-smithy repository for pytorch-pretrained-bert.
Shell
1
star
23

ical2org

Convert iCal-format calendars into a file of Emacs org-mode entries
Python
1
star