Using Fast Weights to Attend to the Recent Past
Reproducing the associative retrieval experiment from the paper
Using Fast Weights to Attend to the Recent Past by Jimmy Ba et al. (Incomplete)
Prerequisites
TensorFlow (version >= 0.8)
How to Run the Experiments
Generate a dataset
$ python generator.py
This script generates a file called associative-retrieval.pkl, which can be used for training.
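To make the task concrete, here is a minimal, hypothetical sketch of the associative retrieval task described in the paper; the exact format written by generator.py may differ, and make_example and its parameters are illustrative:

```python
# Hypothetical sketch of the associative retrieval task (not the repo's
# exact generator): each example is a string of key-value pairs such as
# "c9k8j3", followed by the separator "??" and a query key; the target is
# the digit paired with that key (e.g. "c9k8j3??c" -> "9").
import random
import string

def make_example(num_pairs=3):
    keys = random.sample(string.ascii_lowercase, num_pairs)
    values = [random.choice(string.digits) for _ in keys]
    query = random.choice(keys)
    target = values[keys.index(query)]
    sequence = "".join(k + v for k, v in zip(keys, values)) + "??" + query
    return sequence, target

if __name__ == "__main__":
    for _ in range(3):
        print(make_example())
```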
Run the model
$ python fw.py
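fw.py trains the fast weights model. As a reference for the core recurrence from the paper, here is a NumPy sketch of a single fast-weights step; it is not the code in fw.py, and the names (W_h, W_x, lam, eta, S) are illustrative:

```python
# NumPy sketch of one fast-weights step (Ba et al.), assuming ReLU units.
import numpy as np

def layer_norm(v, eps=1e-5):
    # Normalize a hidden vector to zero mean and unit variance.
    return (v - v.mean()) / np.sqrt(v.var() + eps)

def fast_weights_step(h, x, A, W_h, W_x, lam=0.95, eta=0.5, S=1):
    # Decay the fast weight matrix and add the outer product of the
    # previous hidden state: A(t) = lam * A(t-1) + eta * h(t) h(t)^T.
    A = lam * A + eta * np.outer(h, h)
    # The slow-weight contribution stays fixed while the inner loop runs.
    boundary = W_h @ h + W_x @ x
    hs = np.maximum(boundary, 0.0)  # h_0: ReLU of the slow-weight term
    # S inner-loop steps: h_{s+1} = ReLU(LN[boundary + A h_s]).
    for _ in range(S):
        hs = np.maximum(layer_norm(boundary + A @ hs), 0.0)
    return hs, A
```

With S = 0 the inner loop is skipped and the cell reduces to a plain RNN step, which corresponds to the slow-weights comparison in the findings below.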
Findings
Below are the accuracy and loss curves for R=20. The experiments are barely tuned.
Layer Normalization is crucial for the success of training (see the sketch after this list).
- Without it, training does not converge when the inner step count is larger than 1.
- Even with an inner step of 1, performance without Layer Normalization is much worse: for R=20, only 0.4 accuracy is reached, which is the same level as the other baseline models.
- Even with Layer Normalization, using only slow weights (i.e., a vanilla RNN) performs much worse than using fast weights.
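For reference, Layer Normalization itself is a simple transformation: each hidden vector is re-centered to zero mean, rescaled to unit variance, and then passed through a learned gain and bias. A minimal sketch, with gain and bias taken as plain arrays rather than trained parameters:

```python
# Minimal sketch of Layer Normalization (Ba, Kiros & Hinton).
import numpy as np

def layer_norm(h, gain, bias, eps=1e-5):
    mu = h.mean()
    sigma = np.sqrt(h.var() + eps)
    return gain * (h - mu) / sigma + bias
```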
Further improvements:
- Finish tuning the experiments
- Evaluate the model on other tasks
References
Using Fast Weights to Attend to the Recent Past. Jimmy Ba, Geoffrey Hinton, Volodymyr Mnih, Joel Z. Leibo, Catalin Ionescu. arXiv:1610.06258.
Layer Normalization. Jimmy Ba, Ryan Kiros, Geoffrey Hinton. arXiv:1607.06450.