# Language Modeling with Gated Convolutional Networks
This is a TensorFlow implementation of the Facebook AI Research paper [Language Modeling with Gated Convolutional Networks](https://arxiv.org/abs/1612.08083), which applies a convolutional approach to language modelling with a novel Gated-CNN model.
## Architecture
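The core building block of the Gated-CNN is the gated linear unit, h(X) = (X∗W + b) ⊗ σ(X∗V + c): a sigmoid gate controls how much of the linear path's output is propagated. A minimal per-position sketch in NumPy (names and shapes are illustrative, and the causal convolution is reduced to a plain matrix product for brevity):

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def glu_layer(X, W, b, V, c):
    """Gated linear unit: h(X) = (X @ W + b) * sigmoid(X @ V + c).

    The sigmoid gate (values in (0, 1)) scales the linear path
    element-wise, which is the gating mechanism of the Gated-CNN.
    """
    return (X @ W + b) * sigmoid(X @ V + c)

# Toy shapes: 4 positions, 8 input channels, 16 output channels.
rng = np.random.default_rng(0)
X = rng.standard_normal((4, 8))
W, V = rng.standard_normal((8, 16)), rng.standard_normal((8, 16))
b, c = np.zeros(16), np.zeros(16)
out = glu_layer(X, W, b, V, c)
print(out.shape)  # (4, 16)
```

In the full model the two linear maps are causal 1-D convolutions over the token sequence, and several such gated blocks are stacked with residual connections.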
## Requirements

- Download and extract the Google 1 Billion Word dataset into the `data` folder.
## Usage

To train the model using the default hyperparameters:

```shell
$ python main.py
$ tensorboard --logdir=logs --host=0.0.0.0
```

Check `main.py` for the tunable hyperparameter flags.
## TODO

- Replace the NCE loss with an adaptive softmax.
- Remove the restriction to fixed-size sentences (currently 20 tokens) and support all varied sentence lengths.
- Implement weight normalisation for faster convergence.
- Train deeper models extensively to match the results reported in the paper.
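The weight-normalisation item above refers to reparameterising each weight vector into a direction and a scalar scale, w = g · v / ‖v‖, so that the scale can be learned independently of the direction. A hedged NumPy sketch of the reparameterisation (function and variable names are illustrative, not part of this repo):

```python
import numpy as np

def weight_norm(v, g):
    """Weight normalisation: w = g * v / ||v||.

    Decouples the direction of the weight vector (v) from its
    magnitude (g); by construction the norm of w equals g.
    """
    return g * v / np.linalg.norm(v)

v = np.array([3.0, 4.0])   # ||v|| = 5
w = weight_norm(v, g=2.0)  # w = [1.2, 1.6], ||w|| = 2
print(w)
```

In training, both `v` and `g` would be learned parameters, with the normalisation applied on every forward pass.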