
Implementation of the paper [Using Fast Weights to Attend to the Recent Past](https://arxiv.org/abs/1610.06258)

Using Fast Weights to Attend to the Recent Past

Reproducing the associative retrieval experiment from the paper

Using Fast Weights to Attend to the Recent Past by Jimmy Ba et al. (Incomplete)

Prerequisites

TensorFlow (version >= 0.8)

How to Run the Experiments

Generate a dataset

$ python generator.py

This script generates a file called associative-retrieval.pkl, which can be used for training.
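The associative retrieval task pairs keys with values and then queries one key at the end of the sequence; the model must recall that key's value. A minimal sketch of such a generator (the function name, token format, and use of `?` as a query marker are illustrative assumptions, not the exact output format of generator.py):

```python
import random
import string

def generate_example(R=8, seed=None):
    """Build one associative-retrieval example: R distinct key-value
    pairs, a '??' query marker, then a query key; the target is the
    value originally paired with that key. (Illustrative format only.)"""
    rng = random.Random(seed)
    keys = rng.sample(string.ascii_lowercase, R)        # distinct letter keys
    values = [str(rng.randint(0, 9)) for _ in range(R)] # digit values
    query = rng.choice(keys)
    target = values[keys.index(query)]
    # Interleave keys and values, then append the query suffix.
    sequence = [tok for pair in zip(keys, values) for tok in pair]
    sequence += ["?", "?", query]
    return sequence, target

seq, tgt = generate_example(R=4, seed=0)
print(seq, "->", tgt)
```

A full dataset is then just many such examples, one-hot encoded and pickled for training.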

Run the model

$ python fw.py

Findings

The following is the accuracy and loss graph for R=20. The experiments are only lightly tuned.

Layer Normalization is crucial for successful training:

  • Without it, training does not converge when the number of inner steps is larger than 1.
  • Even with a single inner step, performance without Layer Normalization is much worse: for R=20, accuracy plateaus around 0.4 (roughly the level of the baseline models).
  • Even with Layer Normalization, using only slow weights (i.e., a vanilla RNN) performs much worse than using fast weights.
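The mechanism being compared above is the fast-weight recurrence from the paper: the fast-weight matrix decays and accumulates outer products, A(t) = λ·A(t−1) + η·h(t)h(t)ᵀ, and the hidden state is refined for S inner steps via h_{s+1} = f(LN([W·h + C·x] + A·h_s)). A NumPy sketch under assumed dimensions and a tanh nonlinearity (gain/bias of the layer norm omitted for brevity; this is not the fw.py implementation):

```python
import numpy as np

def layer_norm(x, eps=1e-5):
    # Normalize a vector to zero mean / unit variance (gain=1, bias=0).
    return (x - x.mean()) / (x.std() + eps)

def fast_weights_step(x, h, A, W, C, lam=0.95, eta=0.5, S=1):
    """One recurrent step with a fast-weight matrix A.
    lam: fast-weight decay, eta: fast learning rate, S: inner steps."""
    A = lam * A + eta * np.outer(h, h)   # fast-weight Hebbian update
    boundary = W @ h + C @ x             # slow-weight preliminary state
    hs = np.tanh(boundary)               # h_0 of the inner loop
    for _ in range(S):                   # attend to the recent past
        hs = np.tanh(layer_norm(boundary + A @ hs))
    return hs, A

# Toy usage: unroll a few steps over random inputs.
d, n = 3, 5
rng = np.random.default_rng(0)
W = 0.05 * rng.standard_normal((n, n))
C = 0.05 * rng.standard_normal((n, d))
h, A = np.zeros(n), np.zeros((n, n))
for t in range(4):
    h, A = fast_weights_step(rng.standard_normal(d), h, A, W, C, S=1)
```

With S=0 the inner loop disappears and the cell degenerates to a plain slow-weight RNN step, which is the "vanilla RNN" baseline the bullets above compare against; removing `layer_norm` from the inner loop reproduces the non-convergent setting for S>1.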

Further improvements:

  • Complete fine-tuning
  • Work on other tasks

References

Using Fast Weights to Attend to the Recent Past. Jimmy Ba, Geoffrey Hinton, Volodymyr Mnih, Joel Z. Leibo, Catalin Ionescu. https://arxiv.org/abs/1610.06258

Layer Normalization. Jimmy Ba, Ryan Kiros, Geoffrey Hinton. https://arxiv.org/abs/1607.06450
