• Stars: 247
  • Rank: 164,117 (Top 4%)
  • Language: Jupyter Notebook
  • License: GNU General Publi...
  • Created: about 5 years ago
  • Updated: about 4 years ago


Repository Details

How to build RNNs and LSTMs from scratch with NumPy.

[Update 08/18/2020] Improvements to the dataset; exercises and descriptions have been made clearer.

Originally developed by me (Nicklas Hansen), Peter E. Christensen and Alexander R. Johansen as educational material for the graduate deep learning course at the Technical University of Denmark (DTU). You can access the full course material here. Inspired by the great Andrej Karpathy.


In this lab we will introduce different ways of learning from sequential data. As an example, we will train a neural network to do language modelling, i.e. predict the next token in a sentence. In the context of natural language processing, a token could be a character or a word, but note that the concepts introduced here apply to all kinds of sequential data, such as protein sequences, weather measurements, audio signals, or monetary transaction histories, just to name a few.

To really get a grasp of what is going on inside the recurrent neural networks that we are about to teach you, we will carry out a substantial part of this exercise in NumPy rather than PyTorch. Once you have a solid grasp of the fundamentals, we will proceed to the PyTorch implementation.

In this notebook we will show you:

  • How to represent categorical variables in networks
  • How to build a recurrent neural network (RNN) from scratch
  • How to build an LSTM network from scratch
  • How to build an LSTM network in PyTorch
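
As a preview of the first two points, categorical tokens can be represented as one-hot vectors and fed through a single recurrent step. The sketch below is illustrative only; the weight names (W_xh, W_hh, b_h) and sizes are assumptions and may differ from those used in the notebook:

```python
import numpy as np

vocab = ["a", "b", "EOS", "UNK"]  # the four tokens used in this lab

def one_hot(token, vocab):
    """Represent a categorical variable as a one-hot vector."""
    vec = np.zeros(len(vocab))
    vec[vocab.index(token)] = 1.0
    return vec

hidden_size = 8
rng = np.random.default_rng(0)
W_xh = rng.standard_normal((hidden_size, len(vocab))) * 0.01   # input-to-hidden
W_hh = rng.standard_normal((hidden_size, hidden_size)) * 0.01  # hidden-to-hidden
b_h = np.zeros(hidden_size)

def rnn_step(x, h):
    """One vanilla RNN step: new hidden state from input x and previous h."""
    return np.tanh(W_xh @ x + W_hh @ h + b_h)

h = np.zeros(hidden_size)
for token in ["a", "a", "b", "b", "EOS"]:
    h = rnn_step(one_hot(token, vocab), h)
print(h.shape)  # (8,)
```

A full language model would additionally project h to vocabulary logits at every step; here we only show the recurrence itself.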

Dataset

For this exercise we will create a simple dataset that we can learn from. We generate sequences of the form:

a b EOS,

a a b b EOS,

a a a a a b b b b b EOS

where EOS is a special token denoting the end of a sequence. The task is to predict the next token t_n, i.e. a, b, EOS or the unknown token UNK, given the sequence of tokens t_1, t_2, ..., t_{n-1}, and we are to process sequences one token at a time. As such, the network will need to learn, for example, that 5 b's and an EOS token will follow 5 a's.
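
Generating such a dataset can be sketched in a few lines of NumPy. The function name, parameters, and the range of sequence lengths below are assumptions for illustration; the notebook's own generator may differ:

```python
import numpy as np

EOS = "EOS"  # special end-of-sequence token

def generate_dataset(num_sequences=100, max_count=9, rng=None):
    """Generate sequences of the form a^n b^n EOS with n drawn at random."""
    if rng is None:
        rng = np.random.default_rng(0)
    sequences = []
    for _ in range(num_sequences):
        n = int(rng.integers(1, max_count + 1))  # number of a's (and b's)
        sequences.append(["a"] * n + ["b"] * n + [EOS])
    return sequences

seqs = generate_dataset(3)
print(len(seqs))  # 3
```

Each sequence always contains equal counts of a and b, which is exactly the regularity the network must discover.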

Results

The RNN takes considerable effort to converge to a nice solution:

[Figure: RNN training loss]

The LSTM learns much faster than the RNN:

[Figure: LSTM training loss]

And finally, the PyTorch LSTM learns even faster and converges to a better local minimum:

[Figure: PyTorch LSTM training loss]

After working your way through these exercises, you should have a better understanding of how RNNs work, how to train them, and what they can be used for. And the conclusion? Use PyTorch.
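
For reference, the cell that the from-scratch LSTM section builds up can be sketched as a single NumPy step. The gate weight names (W_f, W_i, W_g, W_o) and the concatenated-input layout are illustrative assumptions, not necessarily the notebook's exact formulation:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def lstm_step(x, h, c, params):
    """One LSTM step: gated update of cell state c and hidden state h."""
    z = np.concatenate([h, x])                       # combined previous h and input x
    f = sigmoid(params["W_f"] @ z + params["b_f"])   # forget gate
    i = sigmoid(params["W_i"] @ z + params["b_i"])   # input gate
    g = np.tanh(params["W_g"] @ z + params["b_g"])   # candidate cell state
    o = sigmoid(params["W_o"] @ z + params["b_o"])   # output gate
    c_new = f * c + i * g                            # keep some memory, add some new
    h_new = o * np.tanh(c_new)                       # expose a gated view of the cell
    return h_new, c_new

hidden, inputs = 8, 4
rng = np.random.default_rng(0)
params = {f"W_{k}": rng.standard_normal((hidden, hidden + inputs)) * 0.01
          for k in "figo"}
params.update({f"b_{k}": np.zeros(hidden) for k in "figo"})
h, c = lstm_step(np.ones(inputs), np.zeros(hidden), np.zeros(hidden), params)
print(h.shape, c.shape)  # (8,) (8,)
```

The separate cell state c is what lets the LSTM carry information over long spans, which is why it converges so much faster than the vanilla RNN on this task.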

More Repositories

1. tdmpc (Python, 352 stars): Code for "Temporal Difference Learning for Model Predictive Control"
2. tdmpc2 (Python, 327 stars): Code for "TD-MPC2: Scalable, Robust World Models for Continuous Control"
3. voice-activity-detection (Jupyter Notebook, 190 stars): Voice Activity Detection (VAD) using deep learning
4. dmcontrol-generalization-benchmark (Python, 165 stars): DMControl Generalization Benchmark
5. puppeteer (Python, 140 stars): Code for "Hierarchical World Models as Visual Whole-Body Humanoid Controllers"
6. policy-adaptation-during-deployment (Python, 111 stars): Training code and evaluation benchmarks for the "Self-Supervised Policy Adaptation during Deployment" paper
7. neural-net-optimization (Python, 61 stars): PyTorch implementations of recent optimization algorithms for deep learning
8. minimal-nas (Python, 36 stars): Minimal implementation of a Neural Architecture Search system
9. svea-vit (Python, 17 stars): Code for the paper "Stabilizing Deep Q-Learning with ConvNets and Vision Transformers under Data Augmentation"
10. adaptive-learning-rate-schedule (Python, 10 stars): PyTorch implementation of the "Learning an Adaptive Learning Rate Schedule" paper found at https://arxiv.org/abs/1909.09712
11. nicklashansen.github.io (HTML, 9 stars): Repository for my personal site https://nicklashansen.github.io/, built with plain HTML
12. a3c (Python, 8 stars): Asynchronous Advantage Actor-Critic using Generalized Advantage Estimation (PyTorch)
13. smallrl (Python, 3 stars): Personal repository for quick RL prototyping; work in progress
14. docker-from-conda (Dockerfile, 3 stars): Builds a Docker image from a conda environment.yml file
15. music-genre-classification (Jupyter Notebook, 1 star): Exam project on Audio Features for Music Genre Classification for course 02452 Audio Information Processing Systems at the Technical University of Denmark (DTU)
16. bachelor-thesis (Python, 1 star): Repository for a bachelor thesis on Automatic Multi-Modal Detection of Autonomic Arousals in Sleep. The thesis itself and all related data are confidential and thus not publicly available, but access to the thesis can be granted by sending a request to [email protected]
17. reinforcement-learning-sutton-barto (Jupyter Notebook, 1 star): Personal repository for a course on reinforcement learning; includes implementations of various problems from the book Reinforcement Learning: An Introduction by R. Sutton and A. Barto
18. nautilus-launcher (Python, 1 star): Minimal launcher for Nautilus