• Stars: 205
• Rank: 191,264 (Top 4%)
• Language: Python
• License: MIT License
• Created: about 7 years ago
• Updated: about 7 years ago

Repository Details

Implementation of Dynamic Routing Between Capsules, Sara Sabour, Nicholas Frosst, Geoffrey E Hinton, NIPS 2017

Dynamic Routing Between Capsules

Chainer implementation of CapsNet for MNIST.

For details, see Dynamic Routing Between Capsules, Sara Sabour, Nicholas Frosst, Geoffrey E. Hinton, NIPS 2017.
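As background, here is a minimal NumPy sketch of the routing-by-agreement procedure described in the paper. The shapes, iteration count, and all names are illustrative assumptions, not this repository's code.

```python
# A minimal NumPy sketch of routing-by-agreement (Procedure 1 in the paper).
import numpy as np

def squash(s, eps=1e-9):
    # Shrinks a vector's norm into [0, 1) while keeping its orientation.
    sq_norm = np.sum(s ** 2, axis=-1, keepdims=True)
    return (sq_norm / (1.0 + sq_norm)) * s / np.sqrt(sq_norm + eps)

def dynamic_routing(u_hat, n_iter=3):
    # u_hat: prediction vectors, shape (n_in, n_out, dim_out).
    n_in, n_out, _ = u_hat.shape
    b = np.zeros((n_in, n_out))  # routing logits, initialized to zero
    for _ in range(n_iter):
        c = np.exp(b) / np.exp(b).sum(axis=1, keepdims=True)  # coupling coefficients
        s = np.einsum('ij,ijd->jd', c, u_hat)   # weighted sum over input capsules
        v = squash(s)                           # output capsules, shape (n_out, dim_out)
        b = b + np.einsum('ijd,jd->ij', u_hat, v)  # raise logits by agreement
    return v

# Example: 1152 primary capsules routed to 10 digit capsules of 16 dimensions.
v = dynamic_routing(np.random.randn(1152, 10, 16) * 0.01)
print(v.shape)  # (10, 16)
```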

Train a model (on GPU 0, with the reconstruction loss):

python -u train.py -g 0 --save saved_model --reconstruct

Test accuracy of a trained model (without reconstruction) reached 99.60%. The paper does not give details of initialization and optimization, so the performance may not match the paper's. To alleviate this, I replaced ReLU with a leaky ReLU with a very small slope (0.05); the modified model achieved 99.66% accuracy (i.e., a 0.34% error rate), matching the result reported in the paper.
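For illustration, the difference between the two activations in Chainer (the slope value is the one used above; the input array is made up):

```python
import numpy as np
import chainer.functions as F

x = np.array([[-1.0, 0.0, 0.5]], dtype=np.float32)
print(F.relu(x).array)                    # [[0.   0.   0.5]]
print(F.leaky_relu(x, slope=0.05).array)  # [[-0.05  0.    0.5 ]]
```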

Visualization through Reconstruction

python visualize.py -g 0 --load saved_model

produces images for analyzing the digit capsules.

Different masks

vis_all.png

The green images in the top row are the real images given to the model. The blue images in the i-th row are reconstructions with digit "i" selected as the target.

If the correct digit is selected as the target, the model reconstructs the image well (see the diagonal cells).

If an irrelevant target is selected, the reconstructed image is spoiled (see "0" and the others in the leftmost column), probably because the needed information is missing from that digit capsule. However, reconstruction toward a related target is not always spoiled, even if the target is not the correct digit (see "8" and "9" in the rightmost column).
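Conceptually, the masking step works like the sketch below: every digit capsule except the target is zeroed before the result is fed to the reconstruction decoder. This is a NumPy illustration under assumed shapes, not this repository's code.

```python
import numpy as np

def mask_capsules(digit_caps, target):
    # digit_caps: (10, 16) digit-capsule outputs; keep only the target row.
    mask = np.zeros_like(digit_caps)
    mask[target] = 1.0
    return digit_caps * mask

caps = np.random.randn(10, 16)           # stand-in for the model's digit capsules
masked = mask_capsules(caps, target=3)   # everything except capsule "3" is zeroed
# masked.reshape(-1) would then be fed to the reconstruction decoder
# (three fully connected layers in the paper) to produce a 28x28 image.
```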

Interpolation of values in digit capsules

Here, I show reconstructed images after linearly tweaking the value of one dimension of a digit capsule (as in Section 5.1 and Figure 4 of the paper). The green images in the center are reconstructions without perturbation. Note that the same dimension can represent a different factor of variation for different digit capsules, because the reconstruction weights are not shared across digits. The perturbation loop is conceptually like the sketch below.
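In this sketch, the range and step of the perturbation follow Section 5.1 of the paper, while the shapes and names are illustrative assumptions, not this repository's code.

```python
import numpy as np

def tweak(capsule, dim, deltas=np.arange(-0.25, 0.26, 0.05)):
    # Yields copies of one digit capsule with a single dimension shifted.
    for d in deltas:
        perturbed = capsule.copy()
        perturbed[dim] += d
        yield perturbed  # each copy is decoded into one image of a row

capsule = np.random.randn(16)  # stand-in for a 16-D digit capsule
rows = [list(tweak(capsule, dim)) for dim in range(16)]
print(len(rows), len(rows[0]))  # 16 dimensions x 11 perturbation steps
```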

You can find and enjoy some factors of variation.

vis_tweaked0.png ... vis_tweaked9.png (one image per digit)

More Repositories

1. bookcorpus - Crawl BookCorpus (Python, 804 stars)
2. attention_is_all_you_need - Transformer of "Attention Is All You Need" (Vaswani et al. 2017) by Chainer (Jupyter Notebook, 313 stars)
3. bert-chainer - Chainer implementation of "BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding" (Python, 220 stars)
4. convolutional_seq2seq - fairseq: Convolutional Sequence to Sequence Learning (Gehring et al. 2017) by Chainer (Python, 65 stars)
5. arxiv_leaks - Whisper of the arxiv: read comments in tex of papers (Python, 31 stars)
6. chainer-openai-transformer-lm - A Chainer implementation of OpenAI's finetuned transformer language model with a script to import the weights pre-trained by OpenAI (Python, 28 stars)
7. der-network - Dynamic Entity Representation (Kobayashi et al., 2016) (Python, 21 stars)
8. variational_dropout_sparsifies_dnn - Variational Dropout Sparsifies Deep Neural Networks (Molchanov et al. 2017) by Chainer (Python, 19 stars)
9. captioning_chainer - A fast implementation of Neural Image Caption by Chainer (Python, 16 stars)
10. efficient_softmax - BlackOut and Adaptive Softmax for language models by Chainer (Python, 11 stars)
11. ROCStory_skipthought_baseline - A novel baseline model for Story Cloze Test and ROCStories (Python, 11 stars)
12. dynamic_neural_text_model - A Neural Language Model for Dynamically Representing the Meanings of Unknown Words and Entities in a Discourse, Sosuke Kobayashi, Naoaki Okazaki, Kentaro Inui, IJCNLP 2017 (9 stars)
13. interval-bound-propagation-chainer - Sven Gowal et al., Scalable Verified Training for Provably Robust Image Classification, ICCV 2019 (Jupyter Notebook, 8 stars)
14. turnover_dropout (Python, 7 stars)
15. learning_to_learn - Learning to learn by gradient descent by gradient descent, Andrychowicz et al., NIPS 2016 (Python, 7 stars)
16. decode_from_mask - Generate a sentence from a masked sentence (Python, 6 stars)
17. weight_normalization - Weight Normalization (Salimans and Kingma, 2016) by Chainer (Python, 6 stars)
18. SDCGAN - Sentence generation by DCGAN (Python, 5 stars)
19. elmo-chainer - Chainer implementation of contextualized word representations from bi-directional language models. Copied into https://github.com/chainer/models/tree/master/elmo-chainer (Python, 5 stars)
20. emergence_of_language_using_discrete_sequences - Emergence of Language Using Discrete Sequences (Jupyter Notebook, 4 stars)
21. skip_thought - Language Model and Skip-Thought Vectors (Kiros et al. 2015) (Python, 3 stars)
22. vqvae_chainer - Chainer's Neural Discrete Representation Learning (Aaron van den Oord et al., 2017) (Python, 3 stars)
23. twitter_conversation_crawler - For crawling conversational tweet threads, e.g. datasets for chatbots (Python, 2 stars)
24. sru_language_model - Language modeling experiments of SRU and variants (Python, 2 stars)
25. rnnlm_chainer - A Fast RNN Language Model by Chainer (Python, 2 stars)