  • Stars: 333
  • Rank: 126,599 (Top 3 %)
  • Language: Python
  • License: MIT License
  • Created almost 9 years ago
  • Updated almost 8 years ago

Repository Details

Implementation of Dynamic memory networks by Kumar et al. http://arxiv.org/abs/1506.07285

Dynamic memory networks in Theano

The aim of this repository is to implement Dynamic memory networks as described in the paper by Kumar et al. and to experiment with its various extensions.

Pretrained models on bAbI tasks can be tested online.

We will cover the process in a series of blog posts.

Repository contents

file | description
main.py | the main entry point to train and test the available network architectures on bAbI-like tasks
dmn_basic.py | our baseline implementation. It is as close to the original as we could understand the paper, except that the number of steps in the main memory GRU is fixed. The attention module uses the T.abs_ function as a distance between two vectors, which causes gradients to become NaN randomly. The results reported in this blog post are based on this network
dmn_smooth.py | uses the square of the Euclidean distance instead of abs in the attention module. Training is very stable. Performance on bAbI is slightly better
dmn_batch.py | dmn_smooth with minibatch training support. The batch size cannot be set to 1 because of a Theano bug
dmn_qa_draft.py | draft version of a DMN designed for answering multiple-choice questions
utils.py | tools for working with bAbI tasks and GloVe vectors
nn_utils.py | helper functions on top of Theano and Lasagne
fetch_babi_data.sh | shell script to fetch bAbI tasks (adapted from MemN2N)
fetch_glove_data.sh | shell script to fetch GloVe vectors (by 5vision)
server/ | contains a Flask-based RESTful API server
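
To make the difference between dmn_basic and dmn_smooth concrete, here is a minimal pure-Python sketch of the two distance features and their derivatives. It is not the repository's Theano code: the function names and sample vectors are hypothetical, and applying the squared distance element-wise (rather than summing it into a scalar) is an assumption. The sketch only shows that the absolute value has a kink at zero while the squared difference is smooth everywhere. Whether that kink is the actual cause of the NaN gradients is not stated above.

```python
def abs_distance(u, v):
    """Element-wise |u - v| (the kind of feature dmn_basic is described as using)."""
    return [abs(a - b) for a, b in zip(u, v)]


def squared_distance(u, v):
    """Element-wise (u - v)^2 (assumed form of the smooth replacement)."""
    return [(a - b) ** 2 for a, b in zip(u, v)]


def abs_distance_grad(u, v):
    """Derivative of |u - v| with respect to u: sign(u - v).

    The derivative is undefined where u == v; returning 0.0 there is a
    convention chosen for this sketch.
    """
    return [float((a > b) - (a < b)) for a, b in zip(u, v)]


def squared_distance_grad(u, v):
    """Derivative of (u - v)^2 with respect to u: 2 * (u - v), continuous everywhere."""
    return [2.0 * (a - b) for a, b in zip(u, v)]


# Hypothetical question and memory vectors, chosen so one component coincides.
q = [0.5, -1.0, 2.0]
m = [0.5, 1.0, 1.5]

print(abs_distance(q, m))           # [0.0, 2.0, 0.5]
print(squared_distance(q, m))       # [0.0, 4.0, 0.25]
print(abs_distance_grad(q, m))      # [0.0, -1.0, 1.0]
print(squared_distance_grad(q, m))  # [0.0, -4.0, 1.0]
```

The gradient of the absolute value jumps between -1 and +1 around zero, whereas the gradient of the squared difference shrinks to zero smoothly as the two vectors approach each other.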

Usage

This implementation is based on Theano and Lasagne. One way to install them is:

pip install -r https://raw.githubusercontent.com/Lasagne/Lasagne/master/requirements.txt
pip install https://github.com/Lasagne/Lasagne/archive/master.zip

The following bash scripts will download bAbI tasks and GloVe vectors.

./fetch_babi_data.sh
./fetch_glove_data.sh

Use main.py to train a network:

python main.py --network dmn_basic --babi_id 1

The states of the network will be saved in the states/ folder. There is one pretrained state for the 1st bAbI task. It should give 100% accuracy on the test set:

python main.py --network dmn_basic --mode test --babi_id 1 --load_state states/dmn_basic.mh5.n40.babi1.epoch4.test0.00033.state
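
When running several experiments, it can help to assemble these command lines programmatically. The sketch below is a hypothetical helper, not part of the repository. It uses only the flags shown above (--network, --babi_id, --mode, --load_state). Passing dmn_smooth as a --network value and the task ids 1 to 3 are assumptions made for illustration.

```python
def build_main_command(network, babi_id, mode=None, load_state=None):
    """Return the argument list for one main.py run, using only the flags shown above."""
    cmd = ["python", "main.py", "--network", network, "--babi_id", str(babi_id)]
    if mode is not None:
        cmd += ["--mode", mode]
    if load_state is not None:
        cmd += ["--load_state", load_state]
    return cmd


# Training commands for three bAbI tasks (network name and task ids are illustrative).
train_commands = [build_main_command("dmn_smooth", task) for task in (1, 2, 3)]

# The test command for the pretrained state mentioned above.
test_command = build_main_command(
    "dmn_basic",
    1,
    mode="test",
    load_state="states/dmn_basic.mh5.n40.babi1.epoch4.test0.00033.state",
)

# Print the commands instead of running them, so the sketch has no side effects.
for cmd in train_commands + [test_command]:
    print(" ".join(cmd))
```

To actually launch a run, each list could be passed to subprocess.check_call from the repository root.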

Server

If you want to start a server that returns predictions for bAbI tasks, do the following:

  1. Generate UI files as described in YerevaNN/dmn-ui
  2. Copy the UI files to server/ui
  3. Run the server
cd server && python api.py

If you have Docker installed, you can pull our Docker image with a ready-to-use DMN server.

docker pull yerevann/dmn
docker run --name dmn_1 -it --rm -p 5000:5000 yerevann/dmn
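
Once the container is running, a quick way to confirm that something is listening on the published port is a plain TCP check. The following Python 3 sketch assumes the server is reachable on 127.0.0.1 and uses port 5000 from the -p 5000:5000 mapping above. The helper name is hypothetical, and the check deliberately avoids calling any API route, since the server's endpoints are not described here.

```python
import socket


def is_port_open(host, port, timeout=1.0):
    """Return True if a TCP connection to (host, port) succeeds within `timeout` seconds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and unreachable hosts.
        return False


if __name__ == "__main__":
    # Port 5000 comes from the docker run command; the host is an assumption.
    print(is_port_open("127.0.0.1", 5000))
```

A True result only shows that the port accepts connections, not that the model has finished loading.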

Roadmap

  • Mini-batch training (done, 08/02/2016)
  • Web interface (done, 08/23/2016)
  • Visualization of episodic memory module (done, 08/23/2016)
  • Regularization (work in progress, L2 doesn't help at all, dropout and batch normalization help a little)
  • Support for multiple-choice questions (work in progress)
  • Evaluation on more complex datasets
  • Import some ideas from Neural Reasoner

License

The MIT License (MIT) Copyright (c) 2016 YerevaNN

More Repositories

1

mimic3-benchmarks

Python suite to construct benchmark machine learning datasets from the MIMIC-III 💊 clinical database.
Python
799
star
2

Spoken-language-identification

Spoken language identification with deep learning
Python
233
star
3

A-Guide-to-Deep-Learning

📚 A detailed guide to deep learning: http://yerevann.com/a-guide-to-deep-learning/
HTML
217
star
4

R-NET-in-Keras

Open R-NET (hy` առնետ 🐁) implementation and detailed analysis: https://git.io/vd8dx
Python
179
star
5

translit-rnn

Automatic transliteration with LSTM
Python
92
star
6

WARP

Code for ACL'2021 paper WARP 🌀 Word-level Adversarial ReProgramming. Outperforming `GPT-3` on SuperGLUE Few-Shot text classification. https://aclanthology.org/2021.acl-long.381/
Python
83
star
7

DIIN-in-Keras

Reproducing Densely Interactive Inference Network in Keras
Python
74
star
8

neural-colorizer

Convolutional autoencoder to colorize greyscale images
Python
43
star
9

BARTSmiles

BARTSmiles, generative masked language model for molecular representations
Python
30
star
10

ChemLactica

Fine-tuning Galactica and Gemma to operate on SMILES. Integrates into a molecular optimization algorithm.
Jupyter Notebook
20
star
11

BioRelEx

🧬 BioRelEx: Biological Relation Extraction Benchmark @ ACL BioNLP Workshop 2019
Python
19
star
12

dmn-ui

UI for Dynamic Memory Networks
JavaScript
15
star
13

yerevann.github.io

YerevaNN blog
CSS
14
star
14

SciERC

A fork of https://bitbucket.org/luanyi/scierc/src
Python
14
star
15

PARASITE

🪱 PARASITE || A parallel sentence data preprocessing toolkit. Originally developed as a part of the `en-ru` winner submission of WMT20 Biomedical Translation Task.
Python
11
star
16

Relation-extraction-pipeline

Pipelines that combine different modules to perform relation extraction
Python
9
star
17

RaSoR-in-Tensorflow

The implementation of one of the SQuAD solutions
Python
7
star
18

armtreebank

Armenian Treebank http://armtreebank.yerevann.com/
Python
6
star
19

word2vec-armenian-wiki

Testing word2vec on Armenian Wikipedia
C
6
star
20

Caffe-python-tools

Some tools written in Python to work with Caffe
Python
4
star
21

SSL-playground

Python
4
star
22

zsee

Zero Shot Event Extraction - Making pretrained sentence encoders more multilingual and language-agnostic. Works best (at the moment) with YerevaNN's internal version of allennlp.
Python
4
star
23

Molecular_Generation_with_GDB13

Jupyter Notebook
3
star
24

Kaggle-diabetic-retinopathy-detection

Scripts used in the Kaggle Diabetic Retinopathy Detection contest by the YerevaNN team
Mathematica
3
star
25

NLOS-Localization-WAIR-D

Python
3
star
26

pmi

Fast pointwise mutual information implementation in C++
C++
3
star
27

RelationClassification

Python
2
star
28

dmn-docker

Dockerfile for starting DMN with UI
2
star
29

hyper-language-identification

Python
2
star
30

amr_seq2seq

Python
2
star
31

dom-gen-failure-modes

Python
1
star
32

char-rnn-constitution

Shell
1
star
33

NN-in-Armenian

Presentation and other stuff on Neural networks in Armenian
1
star
34

JointUD

🚬 JointUD - Universal Dependencies | Part-of-Speech tagging, Morphological parsing and Lemmatization
Python
1
star
35

BioER

Biological entity recognition
Jupyter Notebook
1
star
36

yarx

YARX - Yet Another Relation eXtraction framework, based on SciIE architecture and AllenNLP framework
Python
1
star
37

docker-cudnn-theano

Docker image for Theano with Ubuntu 16.04 + CUDA 8.0 + cuDNN 7
Dockerfile
1
star