  • Stars: 182
  • Rank: 211,154 (Top 5%)
  • Language: Jupyter Notebook
  • Created over 7 years ago
  • Updated about 7 years ago

Repository Details

A machine learning experiment

Knowledge distillation with Keras

A Keras implementation of Hinton's knowledge distillation (KD) [1], a way of transferring knowledge from a large model into a smaller one.
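
As a rough illustration of the idea, the sketch below computes a distillation loss in NumPy, following the formulation in Hinton et al.: the student is trained to match the teacher's temperature-softened probabilities in addition to the hard labels. The temperature, the weight `lambda_soft`, and all function names are assumptions made for illustration; they are not values or code taken from this repository.

```python
import numpy as np


def softmax(logits, temperature=1.0):
    """Row-wise softmax of logits divided by a temperature."""
    z = np.asarray(logits, dtype=np.float64) / temperature
    z = z - z.max(axis=-1, keepdims=True)  # subtract the max for numerical stability
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)


def distillation_loss(student_logits, teacher_logits, hard_targets,
                      temperature=5.0, lambda_soft=0.5):
    """Weighted sum of hard-label and soft-label cross-entropy.

    hard_targets is a one-hot array. temperature and lambda_soft are
    illustrative defaults, not the settings used in the experiments.
    """
    eps = 1e-12
    # Ordinary cross-entropy against the true labels (temperature 1).
    student_probs = softmax(student_logits)
    hard_loss = -np.sum(hard_targets * np.log(student_probs + eps), axis=-1)

    # Cross-entropy against the teacher's softened distribution.
    soft_teacher = softmax(teacher_logits, temperature)
    soft_student = softmax(student_logits, temperature)
    soft_loss = -np.sum(soft_teacher * np.log(soft_student + eps), axis=-1)

    # The T^2 factor keeps the soft-target gradients on a scale comparable
    # to the hard-target gradients, as suggested in the paper.
    total = ((1.0 - lambda_soft) * hard_loss
             + lambda_soft * temperature ** 2 * soft_loss)
    return total.mean()
```

With `lambda_soft=0.0` this reduces to plain cross-entropy on the hard labels, which is a convenient sanity check when experimenting with the weighting.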

Summary

  • I use the Caltech-256 dataset to demonstrate the technique.
  • I transfer knowledge from Xception to MobileNet-0.25 and SqueezeNet v1.1.
  • Results:
| model | accuracy, % | top 5 accuracy, % | logloss |
| --- | --- | --- | --- |
| Xception | 82.3 | 94.7 | 0.705 |
| MobileNet-0.25 | 64.6 | 85.9 | 1.455 |
| MobileNet-0.25 with KD | 66.2 | 86.7 | 1.464 |
| SqueezeNet v1.1 | 67.2 | 86.5 | 1.555 |
| SqueezeNet v1.1 with KD | 68.9 | 87.4 | 1.297 |
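
The three reported metrics can be computed from predicted class probabilities roughly as follows. This is a sketch with hypothetical function names, and it assumes that "logloss" means the mean cross-entropy over the validation set; the notebooks may compute these quantities differently.

```python
import numpy as np


def top_k_accuracy(probs, labels, k=1):
    """Percentage of samples whose true label is among the k highest-scoring classes."""
    probs = np.asarray(probs)
    labels = np.asarray(labels)
    # Indices of the k largest probabilities in each row
    # (their order within the k is irrelevant here).
    top_k = np.argpartition(-probs, k - 1, axis=1)[:, :k]
    hits = (top_k == labels[:, None]).any(axis=1)
    return 100.0 * hits.mean()


def logloss(probs, labels, eps=1e-15):
    """Mean negative log-probability assigned to the true class."""
    probs = np.clip(np.asarray(probs, dtype=np.float64), eps, 1.0)
    labels = np.asarray(labels)
    return -np.log(probs[np.arange(len(labels)), labels]).mean()
```

Here `probs` is an array of shape (number of images, number of classes) and `labels` holds integer class indices; `top_k_accuracy(probs, labels, k=5)` corresponds to the "top 5 accuracy" column.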

Implementation details

  • I use models pretrained on ImageNet.
  • For validation I use 20 images from each category.
  • For training I use 100 images from each category.
  • I use random crops and color augmentation to balance the dataset.
  • I resize all images to 299x299.
  • In all models I train the last two layers.
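
The per-category split described above can be sketched as below. The helper name `split_category` is hypothetical, and the balancing rule is an assumption: categories with fewer than 100 images left after the validation split are topped up with repeated images, each flagged to receive a random crop and color augmentation. The notebooks may balance the dataset in a different way.

```python
import random


def split_category(image_names, n_val=20, n_train=100, seed=0):
    """Split one category's images into validation names and a training plan.

    Returns (val, train), where train is a list of (name, augment) pairs.
    augment=True marks a repeated image that should get a random crop and
    color augmentation so that repeats are not exact duplicates.
    """
    rng = random.Random(seed)
    names = sorted(image_names)
    rng.shuffle(names)

    val = names[:n_val]
    remaining = names[n_val:]
    if not remaining:
        raise ValueError("category has no images left for training")

    originals = remaining[:n_train]
    train = [(name, False) for name in originals]
    # Assumption: top up small categories by resampling images already chosen.
    while len(train) < n_train:
        train.append((rng.choice(originals), True))
    return val, train
```

For a category with 80 images, this yields 20 validation images and 100 training entries, 40 of which are augmented repeats of the 60 remaining originals.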

Notes on flow_from_directory

I use three slightly different versions of Keras' ImageDataGenerator.flow_from_directory:

  • the original version, for the initial training of Xception and MobileNet.
  • ver1, for getting logits from Xception. Here DirectoryIterator.next also outputs image names.
  • ver2, for knowledge transfer. Here DirectoryIterator.next packs logits with the hard true targets.

The three versions differ only in the DirectoryIterator.next function.
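
One way to picture the packing step is the NumPy sketch below. A Keras loss function receives a single target array per output, so packing the teacher logits next to the one-hot labels is one plausible way to hand both to a distillation loss. The function names and the concatenation layout are assumptions for illustration; the modified DirectoryIterator.next in this repository may lay the data out differently.

```python
import numpy as np


def pack_targets(hard_targets, teacher_logits):
    """Concatenate one-hot targets and teacher logits into a single target array."""
    return np.concatenate([hard_targets, teacher_logits], axis=-1)


def unpack_targets(packed, n_classes):
    """Invert pack_targets: split the packed array back into its two halves."""
    return packed[..., :n_classes], packed[..., n_classes:]


def batches_with_logits(images, labels, names, logits_by_name,
                        n_classes, batch_size):
    """Yield (image_batch, packed_targets), looking up teacher logits by image name."""
    for start in range(0, len(images), batch_size):
        stop = start + batch_size
        # One-hot encode the integer labels of this batch.
        hard = np.eye(n_classes)[labels[start:stop]]
        # Teacher logits were saved earlier, keyed by image name.
        logits = np.stack([logits_by_name[n] for n in names[start:stop]])
        yield images[start:stop], pack_targets(hard, logits)
```

Inside the loss, `unpack_targets(y_true, n_classes)` recovers the hard targets and the teacher logits. Keying the saved logits by image name is the reason the logit-extraction pass needs the iterator to output names as well.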

Requirements

  • Python 3.5
  • Keras 2.0.6
  • torchvision, Pillow
  • numpy, pandas, tqdm

References

[1] Geoffrey Hinton, Oriol Vinyals, Jeff Dean. Distilling the Knowledge in a Neural Network. 2015.
