  • Stars: 334
  • Rank: 121,993 (Top 3 %)
  • Language: Python
  • License: MIT License
  • Created over 8 years ago
  • Updated over 7 years ago

Repository Details

Implementation of Dynamic memory networks by Kumar et al. http://arxiv.org/abs/1506.07285

Dynamic memory networks in Theano

The aim of this repository is to implement Dynamic memory networks as described in the paper by Kumar et al. and to experiment with its various extensions.

Pretrained models on bAbI tasks can be tested online.

We will cover the process in a series of blog posts.

Repository contents

file | description
main.py | the main entry point to train and test available network architectures on bAbI-like tasks
dmn_basic.py | our baseline implementation. It is as close to the original as we could understand the paper, except that the number of steps in the main memory GRU is fixed. The attention module uses the T.abs_ function as a distance between two vectors, which causes gradients to become NaN at random. The results reported in this blog post are based on this network
dmn_smooth.py | uses the square of the Euclidean distance instead of abs in the attention module. Training is very stable. Performance on bAbI is slightly better
dmn_batch.py | dmn_smooth with minibatch training support. The batch size cannot be set to 1 because of a Theano bug
dmn_qa_draft.py | draft version of a DMN designed for answering multiple-choice questions
utils.py | tools for working with bAbI tasks and GloVe vectors
nn_utils.py | helper functions on top of Theano and Lasagne
fetch_babi_data.sh | shell script to fetch bAbI tasks (adapted from MemN2N)
fetch_glove_data.sh | shell script to fetch GloVe vectors (by 5vision)
server/ | contains a Flask-based RESTful API server
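
The difference between dmn_basic.py and dmn_smooth.py described above comes down to the element-wise distance feature fed to the attention module. The following is a minimal sketch in plain Python, not the repository's Theano code; all function names are hypothetical. It compares the two features and their derivatives: the derivative of |a - b| jumps between -1 and +1 around zero, while the derivative of (a - b)^2 passes through zero smoothly. Whether this kink is the exact cause of the NaN gradients mentioned above is not established here.

```python
# Illustrative only: plain-Python stand-ins for the two distance features.
# Function names are hypothetical and do not come from the repository.

def abs_distance(a, b):
    """Element-wise absolute difference |a_i - b_i| (dmn_basic-style feature)."""
    return [abs(x - y) for x, y in zip(a, b)]


def squared_distance(a, b):
    """Element-wise squared difference (a_i - b_i)^2 (dmn_smooth-style feature)."""
    return [(x - y) ** 2 for x, y in zip(a, b)]


def abs_distance_grad(a, b):
    """Derivative of |a_i - b_i| w.r.t. a_i: sign(a_i - b_i).

    The true derivative is undefined at a_i == b_i; returning 0 there is
    a subgradient choice made for this sketch.
    """
    return [(x > y) - (x < y) for x, y in zip(a, b)]


def squared_distance_grad(a, b):
    """Derivative of (a_i - b_i)^2 w.r.t. a_i: 2 * (a_i - b_i), smooth everywhere."""
    return [2 * (x - y) for x, y in zip(a, b)]


if __name__ == "__main__":
    a, b = [1.0, 2.0], [1.5, 2.0]
    print(abs_distance(a, b), abs_distance_grad(a, b))
    print(squared_distance(a, b), squared_distance_grad(a, b))
```

The squared form trades the sharp corner at zero for a gradient that shrinks as the two vectors get closer, which is consistent with the more stable training reported for dmn_smooth.py.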

Usage

This implementation is based on Theano and Lasagne. One way to install them is:

pip install -r https://raw.githubusercontent.com/Lasagne/Lasagne/master/requirements.txt
pip install https://github.com/Lasagne/Lasagne/archive/master.zip

The following bash scripts will download bAbI tasks and GloVe vectors.

./fetch_babi_data.sh
./fetch_glove_data.sh
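
For orientation, the downloaded bAbI files are plain text: each line starts with a line number that resets to 1 at the start of a new story, and question lines carry a tab-separated answer and supporting-fact ids. The sketch below shows one way such a file could be parsed. It is an illustration under those assumptions, not the code in utils.py, and the function name parse_babi is hypothetical.

```python
# A minimal sketch of a bAbI parser; parse_babi is a hypothetical name,
# not the function used in the repository's utils.py.

def parse_babi(lines):
    """Return a list of (facts, question, answer) tuples.

    facts is the list of statements seen so far in the current story.
    """
    examples = []
    facts = []
    for raw in lines:
        line = raw.strip()
        if not line:
            continue
        line_id, _, text = line.partition(" ")
        if int(line_id) == 1:
            # A line numbered 1 starts a new story, so earlier facts are dropped.
            facts = []
        if "\t" in text:
            # Question line: "question \t answer \t supporting fact ids".
            parts = text.split("\t")
            question, answer = parts[0].strip(), parts[1]
            examples.append((list(facts), question, answer))
        else:
            facts.append(text)
    return examples


if __name__ == "__main__":
    sample = [
        "1 Mary moved to the bathroom.",
        "2 John went to the hallway.",
        "3 Where is Mary? \tbathroom\t1",
    ]
    for facts, question, answer in parse_babi(sample):
        print(facts, question, answer)
```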

Use main.py to train a network:

python main.py --network dmn_basic --babi_id 1

The states of the network will be saved in the states/ folder. There is one pretrained state for the 1st bAbI task. It should give 100% accuracy on the test set:

python main.py --network dmn_basic --mode test --babi_id 1 --load_state states/dmn_basic.mh5.n40.babi1.epoch4.test0.00033.state
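
The two commands above use four flags: --network, --mode, --babi_id and --load_state. As a rough illustration of how such a command line could be wired up, here is a small argparse sketch. Only the flag names come from the commands above; the defaults, the help strings, and the build_parser name are assumptions, and the real main.py may define more options or different defaults.

```python
import argparse


def build_parser():
    """Build a parser for the four flags shown in the usage commands.

    Defaults below are assumptions for this sketch, not taken from main.py.
    """
    parser = argparse.ArgumentParser(
        description="Train or test a DMN on a bAbI task (illustrative sketch)."
    )
    parser.add_argument("--network", default="dmn_basic",
                        help="network module to use, e.g. dmn_basic")
    # Assumed default: the training command above omits --mode.
    parser.add_argument("--mode", default="train",
                        help="'train' or 'test'")
    # Kept as a string in this sketch; the real type is an assumption.
    parser.add_argument("--babi_id", default="1",
                        help="id of the bAbI task")
    parser.add_argument("--load_state", default="",
                        help="path to a saved state file to load")
    return parser


if __name__ == "__main__":
    args = build_parser().parse_args()
    print(args.network, args.mode, args.babi_id, args.load_state)
```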

Server

To start a server that returns predictions for bAbI tasks, do the following:

  1. Generate UI files as described in YerevaNN/dmn-ui
  2. Copy the UI files to server/ui
  3. Run the server:
     cd server && python api.py

If you have Docker installed, you can pull our Docker image, which contains a ready-to-run DMN server.

docker pull yerevann/dmn
docker run --name dmn_1 -it --rm -p 5000:5000 yerevann/dmn

Roadmap

  • Mini-batch training (done, 08/02/2016)
  • Web interface (done, 08/23/2016)
  • Visualization of episodic memory module (done, 08/23/2016)
  • Regularization (work in progress, L2 doesn't help at all, dropout and batch normalization help a little)
  • Support for multiple-choice questions (work in progress)
  • Evaluation on more complex datasets
  • Import some ideas from Neural Reasoner

License

The MIT License (MIT) Copyright (c) 2016 YerevaNN

More Repositories

  1. mimic3-benchmarks (Python, 771 stars): Python suite to construct benchmark machine learning datasets from the MIMIC-III 💊 clinical database.
  2. Spoken-language-identification (Python, 231 stars): Spoken language identification with deep learning
  3. A-Guide-to-Deep-Learning (HTML, 215 stars): 📚 A detailed guide to deep learning: http://yerevann.com/a-guide-to-deep-learning/
  4. R-NET-in-Keras (Python, 180 stars): Open R-NET (Armenian: առնետ, "rat" 🐁) implementation and detailed analysis: https://git.io/vd8dx
  5. translit-rnn (Python, 92 stars): Automatic transliteration with LSTM
  6. WARP (Python, 83 stars): Code for ACL'2021 paper WARP 🌀 Word-level Adversarial ReProgramming. Outperforming `GPT-3` on SuperGLUE Few-Shot text classification. https://aclanthology.org/2021.acl-long.381/
  7. DIIN-in-Keras (Python, 75 stars): Reproducing Densely Interactive Inference Network in Keras
  8. neural-colorizer (Python, 43 stars): Convolutional autoencoder to colorize greyscale images
  9. BARTSmiles (Python, 27 stars): BARTSmiles, generative masked language model for molecular representations
  10. BioRelEx (Python, 19 stars): 🧬 BioRelEx: Biological Relation Extraction Benchmark @ ACL BioNLP Workshop 2019
  11. dmn-ui (JavaScript, 15 stars): UI for Dynamic Memory Networks
  12. yerevann.github.io (CSS, 14 stars): YerevaNN blog
  13. SciERC (Python, 14 stars): A fork of https://bitbucket.org/luanyi/scierc/src
  14. PARASITE (Python, 11 stars): 🪱 PARASITE || A parallel sentence data preprocessing toolkit. Originally developed as a part of the `en-ru` winner submission of WMT20 Biomedical Translation Task.
  15. Relation-extraction-pipeline (Python, 9 stars): Pipelines that combine different modules to perform relation extraction
  16. RaSoR-in-Tensorflow (Python, 7 stars): The implementation of one of the SQuAD solutions
  17. word2vec-armenian-wiki (C, 6 stars): Testing word2vec on Armenian Wikipedia
  18. armtreebank (Python, 5 stars): Armenian Treebank http://armtreebank.yerevann.com/
  19. Caffe-python-tools (Python, 4 stars): Some tools written in Python to work with Caffe
  20. SSL-playground (Python, 4 stars)
  21. zsee (Python, 4 stars): Zero Shot Event Extraction - Making pretrained sentence encoders more multilingual and language-agnostic. Works best (at the moment) with YerevaNN's internal version of allennlp.
  22. Kaggle-diabetic-retinopathy-detection (Mathematica, 3 stars): Scripts used in the Kaggle Diabetic Retinopathy Detection contest by the YerevaNN team
  23. pmi (C++, 3 stars): Fast pointwise mutual information implementation in C++
  24. RelationClassification (Python, 2 stars)
  25. dmn-docker (2 stars): Dockerfile for starting DMN with UI
  26. hyper-language-identification (Python, 2 stars)
  27. amr_seq2seq (Python, 2 stars)
  28. dom-gen-failure-modes (Python, 1 star)
  29. char-rnn-constitution (Shell, 1 star)
  30. JointUD (Python, 1 star): 🚬 JointUD - Universal Dependencies | Part-of-Speech tagging, Morphological parsing and Lemmatization
  31. NN-in-Armenian (1 star): Presentation and other stuff on Neural networks in Armenian
  32. yarx (Python, 1 star): YARX - Yet Another Relation eXtraction framework, based on SciIE architecture and AllenNLP framework
  33. Molecular_Generation_with_GDB13 (Jupyter Notebook, 1 star)
  34. BioER (Jupyter Notebook, 1 star): Biological entity recognition
  35. docker-cudnn-theano (Dockerfile, 1 star): Docker image for Theano with Ubuntu 16.04 + CUDA 8.0 + cuDNN 7
  36. NLOS-Localization-WAIR-D (Python, 1 star)