  • Language: Python
  • License: MIT License
  • Created almost 7 years ago
  • Updated over 4 years ago

TensorFlow implementation of GoogLeNet and Inception for image classification.

GoogLeNet for Image Classification

  • This repository contains examples of natural image classification using a pre-trained model, as well as training an Inception network from scratch on the CIFAR-10 dataset (93.64% accuracy on the testing set). The pre-trained model on CIFAR-10 can be downloaded from here.
  • Architecture of GoogLeNet from the paper (figure: googlenet)

Requirements

Implementation Details

For testing the pre-trained model

  • Images are rescaled so that the smallest side equals 224 before being fed into the model. This differs from the original paper, which uses an ensemble of 7 similar models with 144 224x224 crops per image for testing, so the performance will not be as good as reported in the paper.
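The rescaling rule above can be sketched as a small size calculation. This is only an illustration, not the repository's code: the function name rescaled_size and the choice to round the longer side to the nearest pixel are assumptions.

```python
def rescaled_size(height, width, target=224):
    """Return (new_height, new_width) so that the smaller side equals `target`.

    The aspect ratio is kept; the longer side is rounded to the nearest pixel
    (the rounding mode is an assumption, not taken from the repository).
    """
    if height <= 0 or width <= 0:
        raise ValueError("height and width must be positive")
    scale = target / min(height, width)
    if height <= width:
        return target, max(target, round(width * scale))
    return max(target, round(height * scale)), target


# A 480 x 640 (height x width) image keeps its aspect ratio.
print(rescaled_size(480, 640))  # (224, 299)
```

Because only the smallest side is fixed, non-square images yield inputs wider or taller than 224, unlike the fixed 224x224 crops used in the paper.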

For training from scratch on CIFAR-10

  • All the LRN layers are removed from the convolutional layers.
  • Batch normalization and ReLU activation are used in all the convolutional layers, including those in the Inception structure, except the output layer.
  • Two auxiliary classifiers are used as described in the paper, though 512 instead of 1024 hidden units are used in the two fully connected layers to reduce computation. However, I found that the results on CIFAR-10 are almost the same with and without the auxiliary classifiers.
  • Since the 32 x 32 images are down-sampled to 1 x 1 before being fed into inception_5a, the multi-scale structure of the inception layers becomes less useful, which harms the performance (around 80% accuracy). To make full use of the multi-scale structure, the stride of the first convolutional layer is reduced to 1 and the first two max pooling layers are removed. The feature map (32 x 32 x channels) then has almost the same size as described in Table 1 of the paper (28 x 28 x channels) before being fed into inception_3a. I also tried only reducing the stride or only removing one max pooling layer, but the current setting gives the best performance on the testing set.
  • During training, dropout with keep probability 0.4 is applied to the two fully connected layers, and weight decay of 5e-4 is used as well.
  • The network is trained with the Adam optimizer. The batch size is 128. The initial learning rate is 1e-3; it decays to 1e-4 after 30 epochs and finally to 1e-5 after 50 epochs.
  • The mean value of each color channel, computed from the training set, is subtracted from the input images.
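The down-sampling argument in the list above can be checked with a little arithmetic. The sketch below assumes "SAME"-style padding, so each stride-s layer maps a size n to ceil(n / s); the padding mode, the helper name trace_sizes, and the stride lists are assumptions made for illustration, not code from the repository.

```python
import math


def trace_sizes(size, strides):
    """Return the spatial size after each down-sampling layer.

    Assumes 'SAME'-style padding, i.e. output = ceil(input / stride).
    """
    sizes = []
    for stride in strides:
        size = math.ceil(size / stride)
        sizes.append(size)
    return sizes


# Down-sampling layers before inception_5a in the paper's layout:
# conv1 (stride 2), pool1, pool2, pool3, pool4 (stride 2 each).
ORIGINAL_STRIDES = [2, 2, 2, 2, 2]
# Modified layout described above: conv1 stride 1, pool1 and pool2 removed.
MODIFIED_STRIDES = [1, 2, 2]  # conv1, pool3, pool4

print(trace_sizes(224, ORIGINAL_STRIDES))  # [112, 56, 28, 14, 7]
print(trace_sizes(32, ORIGINAL_STRIDES))   # [16, 8, 4, 2, 1]
print(trace_sizes(32, MODIFIED_STRIDES))   # [32, 16, 8]
```

Under these assumptions, a 32 x 32 input reaches inception_3a at 4 x 4 and inception_5a at 1 x 1 in the original layout, while the modified layout feeds 32 x 32 into inception_3a, close to the 28 x 28 of the ImageNet setting.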

Usage

ImageNet Classification

Preparation

  • Download the pre-trained parameters here. They are originally from here.
  • Set up the paths in examples/inception_pretrained.py: PRETRINED_PATH is the path to the pre-trained model, and DATA_PATH is the directory containing the testing images.

Run

Go to examples/, put the test images in the DATA_PATH folder, then run the script:

python inception_pretrained.py --im_name PART_OF_IMAGE_NAME
  • --im_name selects the images to test by matching part of the file name. If the testing images are all png files, this can be png. The default is .jpg.
  • The output will be the top-5 class labels and probabilities.
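As a rough illustration of the top-5 output, the sketch below ranks a list of class probabilities and formats them with two decimal places, in the style of the Results section. The function names, probabilities, and labels are made up for the example; this is not the script's actual code.

```python
def top_k_predictions(probs, labels, k=5):
    """Return the k most probable (probability, label) pairs, highest first."""
    if len(probs) != len(labels):
        raise ValueError("probs and labels must have the same length")
    ranked = sorted(zip(probs, labels), key=lambda pair: pair[0], reverse=True)
    return ranked[:k]


def format_predictions(predictions):
    """Format pairs as '<rank>: probability: <p>, label: <name>' lines."""
    return [
        "{}: probability: {:.2f}, label: {}".format(rank, prob, label)
        for rank, (prob, label) in enumerate(predictions, start=1)
    ]


# Made-up probabilities for illustration only; not output from the model.
example_probs = [0.05, 0.70, 0.10, 0.02, 0.08, 0.05]
example_labels = ["cat", "dog", "fox", "owl", "bee", "elk"]
for line in format_predictions(top_k_predictions(example_probs, example_labels)):
    print(line)
```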

Train the network on CIFAR-10

Preparation

  • Download the CIFAR-10 dataset from here.
  • Set up the paths in examples/inception_cifar.py: DATA_PATH is the directory containing CIFAR-10, and SAVE_PATH is the directory where the summary file and trained model are saved or loaded.

Train the model

Go to examples/ and run the script:

python inception_cifar.py --train \
  --lr LEARNING_RATE \
  --bsize BATCH_SIZE \
  --keep_prob KEEP_PROB_OF_DROPOUT \
  --maxepoch MAX_TRAINING_EPOCH
  • The summary and model will be saved in SAVE_PATH. A pre-trained model on CIFAR-10 can be downloaded from here.
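The step-wise learning rate decay described under Implementation Details (1e-3, then 1e-4 after 30 epochs, then 1e-5 after 50 epochs) can be written as a function of the epoch index. This is a minimal sketch with zero-based epochs assumed; the function name is hypothetical, and how the script combines --lr with its own schedule is not stated here.

```python
def learning_rate(epoch):
    """Return the learning rate for a zero-based epoch index.

    Epochs 0-29 use 1e-3, epochs 30-49 use 1e-4, and later epochs use 1e-5.
    The zero-based boundary convention is an assumption.
    """
    if epoch < 0:
        raise ValueError("epoch must be non-negative")
    if epoch < 30:
        return 1e-3
    if epoch < 50:
        return 1e-4
    return 1e-5


# Print the rate at the first epoch and at each decay boundary.
for epoch in (0, 30, 50):
    print(epoch, learning_rate(epoch))
```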

Evaluate the model

Go to examples/ and put the pre-trained model in SAVE_PATH. Then run the script:

python inception_cifar.py --eval \
  --load PRE_TRAINED_MODEL_ID
  • The pre-trained model ID is the epoch ID shown in the saved model file name. The default value is 99, which corresponds to the model I uploaded.
  • The output will be the accuracy on the training and testing sets.

Results

Image classification using pre-trained model

  • The top five predictions are shown, with probabilities rounded to two decimal places. Note that the pre-trained model is trained on ImageNet.
  • Results of VGG19 for the same images can be found here. The pre-processing of images is the same for both experiments.

Data Source: COCO
  1: probability: 1.00, label: brown bear, bruin, Ursus arctos
  2: probability: 0.00, label: ice bear, polar bear
  3: probability: 0.00, label: hyena, hyaena
  4: probability: 0.00, label: chow, chow chow
  5: probability: 0.00, label: American black bear, black bear

Data Source: COCO
  1: probability: 0.79, label: street sign
  2: probability: 0.06, label: traffic light, traffic signal, stoplight
  3: probability: 0.03, label: parking meter
  4: probability: 0.02, label: mailbox, letter box
  5: probability: 0.01, label: balloon

Data Source: COCO
  1: probability: 0.94, label: trolleybus, trolley coach
  2: probability: 0.05, label: passenger car, coach, carriage
  3: probability: 0.00, label: fire engine, fire truck
  4: probability: 0.00, label: streetcar, tram, tramcar, trolley
  5: probability: 0.00, label: minibus

Data Source: COCO
  1: probability: 0.35, label: burrito
  2: probability: 0.17, label: potpie
  3: probability: 0.14, label: mashed potato
  4: probability: 0.10, label: plate
  5: probability: 0.03, label: pizza, pizza pie

Data Source: ImageNet
  1: probability: 1.00, label: goldfish, Carassius auratus
  2: probability: 0.00, label: rock beauty, Holocanthus tricolor
  3: probability: 0.00, label: puffer, pufferfish, blowfish, globefish
  4: probability: 0.00, label: tench, Tinca tinca
  5: probability: 0.00, label: anemone fish

Data Source: Self Collection
  1: probability: 0.32, label: Egyptian cat
  2: probability: 0.30, label: tabby, tabby cat
  3: probability: 0.05, label: tiger cat
  4: probability: 0.02, label: mouse, computer mouse
  5: probability: 0.02, label: paper towel

Data Source: Self Collection
  1: probability: 1.00, label: streetcar, tram, tramcar, trolley, trolley car
  2: probability: 0.00, label: passenger car, coach, carriage
  3: probability: 0.00, label: trolleybus, trolley coach, trackless trolley
  4: probability: 0.00, label: electric locomotive
  5: probability: 0.00, label: freight car

Train the network from scratch on CIFAR-10

  • Here is a similar experiment using VGG19.

learning curve for training set

(figure: train_lc)

learning curve for testing set

  • The accuracy on the testing set is 93.64% at around 100 epochs. Slight over-fitting can be observed at the end of training.

(figure: valid_lc)

Author

Qian Ge
