  • Stars: 395
  • Rank: 109,040 (Top 3%)
  • Language: Lua
  • License: BSD 2-Clause "Simplified" License
  • Created: over 9 years ago
  • Updated: over 7 years ago


Repository Details

An ImageNet example in Torch.

## Training an Object Classifier in Torch-7 on multiple GPUs over ImageNet

In this concise example (1,200 lines, including a general-purpose and highly scalable data loader for images), we showcase how to:

  • train AlexNet, Overfeat, VGG, or GoogLeNet on ImageNet
  • use multiple backends: CuDNN, CuNN
  • use nn.DataParallelTable to speed up training over multiple GPUs (see the sketch after this list)
  • do multithreaded data loading from disk (this showcases sending tensors from one thread to another without serialization)
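
As a taste of the multi-GPU piece, here is a minimal sketch of wrapping a network in nn.DataParallelTable. It assumes cutorch/cunn are installed and two GPUs are visible; the network below is a placeholder, and the real setup lives in model.lua and may differ in detail.

require 'cunn'

-- placeholder single-GPU network (the real one is built in model.lua)
local net = nn.Sequential()
   :add(nn.SpatialConvolution(3, 64, 11, 11, 4, 4, 2, 2))
   :add(nn.ReLU(true))
net = net:cuda()

-- split each mini-batch along dimension 1 (the batch dimension)
-- and run replicas of the network on GPUs 1 and 2
local dpt = nn.DataParallelTable(1)
dpt:add(net, {1, 2})
dpt = dpt:cuda()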

Requirements

Data processing

The images don't need to be preprocessed or packaged in any database. It is preferred to keep the dataset on an SSD, but we have used the data loader comfortably over NFS without loss in speed. We just use a simple convention: SubFolderName == ClassName. So, for example, if you have classes {cat,dog}, cat images go into the folder dataset/cat and dog images go into dataset/dog.
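
For instance, a two-class dataset laid out with this convention would look like the following on disk (the file names here are arbitrary and purely illustrative):

dataset/cat/cat_001.jpg
dataset/cat/cat_002.jpg
dataset/dog/dog_001.jpg
dataset/dog/dog_002.jpg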

The training images for ImageNet are already in appropriate subfolders (like n07579787, n07880968). You need to get the validation groundtruth and move the validation images into the appropriate subfolders. To do this, download ILSVRC2012_img_train.tar and ILSVRC2012_img_val.tar and use the following commands:

# extract train data
mkdir train && mv ILSVRC2012_img_train.tar train/ && cd train
tar -xvf ILSVRC2012_img_train.tar && rm -f ILSVRC2012_img_train.tar
find . -name "*.tar" | while read NAME ; do mkdir -p "${NAME%.tar}"; tar -xvf "${NAME}" -C "${NAME%.tar}"; rm -f "${NAME}"; done
# extract validation data
cd ../ && mkdir val && mv ILSVRC2012_img_val.tar val/ && cd val && tar -xvf ILSVRC2012_img_val.tar
wget -qO- https://raw.githubusercontent.com/soumith/imagenetloader.torch/master/valprep.sh | bash

Now you are all set!

If your ImageNet dataset is on an HDD or a slow SSD, run this command to resize all the images so that the smaller dimension is 256 and the aspect ratio is kept intact. This makes loading the data from disk faster.

find . -name "*.JPEG" | xargs -I {} convert {} -resize "256^>" {}

Running

The training script comes with several options, which can be listed by running it with the --help flag:

th main.lua --help

To run the training, simply run main.lua. By default, the script runs 1-GPU AlexNet with the CuDNN backend and 2 data-loader threads.

th main.lua -data [imagenet-folder with train and val folders]

For 2-GPU data-parallel AlexNet + CuDNN (via nn.DataParallelTable), you can run it this way:

th main.lua -data [imagenet-folder with train and val folders] -nGPU 2 -backend cudnn -netType alexnet

Similarly, you can switch the backend to 'cunn' to use a different set of CUDA kernels.

Alternatively, you can train OverFeat using the following command:

th main.lua -data [imagenet-folder with train and val folders] -netType overfeat

# multi-GPU overfeat (let's say 2-GPU)
th main.lua -data [imagenet-folder with train and val folders] -netType overfeat -nGPU 2

The training script prints the current Top-1 and Top-5 error as well as the objective loss at every mini-batch. We hard-coded a learning rate schedule so that AlexNet converges to an error of 42.5% at the end of 53 epochs.
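
The schedule itself is defined in train.lua. As an illustration of the pattern only (the epoch ranges and values below are placeholders, not necessarily the ones used in this repo), a hard-coded step schedule in Torch typically looks like this:

-- illustrative regime table: {startEpoch, endEpoch, learningRate, weightDecay}
-- the numbers are placeholders; see train.lua for the real schedule
local regimes = {
   {  1, 18, 1e-2, 5e-4 },
   { 19, 29, 5e-3, 5e-4 },
   { 30, 43, 1e-3, 0 },
   { 44, 52, 5e-4, 0 },
   { 53, 1e8, 1e-4, 0 },
}

local function paramsForEpoch(epoch)
   for _, row in ipairs(regimes) do
      if epoch >= row[1] and epoch <= row[2] then
         return { learningRate = row[3], weightDecay = row[4] }
      end
   end
end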

At the end of every epoch, the model is saved to disk (as model_[xx].t7, where xx is the epoch number). You can reload this model into Torch at any time using torch.load:

model = torch.load('model_10.t7') -- loading back a saved model

Similarly, if you would like to test your model on a new image, you can use testHook from line 103 in donkey.lua to load your image and send it through the model for predictions. For example:

dofile('donkey.lua')                    -- defines testHook and loadSize
img = testHook({loadSize}, 'test.jpg')  -- load and preprocess the image
model = torch.load('model_10.t7')       -- load a saved checkpoint
if img:dim() == 3 then
  img = img:view(1, img:size(1), img:size(2), img:size(3))  -- add a batch dimension
end
predictions = model:forward(img:cuda()) -- forward pass on the GPU
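
To turn the network output into class predictions, you can sort it. A minimal sketch follows, assuming the model ends in a LogSoftMax layer so that each row of predictions holds per-class log-probabilities; adapt it if your model's final layer differs.

-- sort each row in descending order; the indices are class ids as assigned by the data loader
local sortedScores, sortedClasses = predictions:float():sort(2, true)
local top5 = sortedClasses:narrow(2, 1, 5)  -- the 5 highest-scoring class indices
print(top5)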

If you ever want to reuse this example and debug your scripts, it is suggested that you debug and develop in single-threaded mode, so that stack traces are printed fully:

th main.lua -nDonkeys 0 [...options...]

Code Description

  • main.lua (~30 lines) - loads all other files and starts training.
  • opts.lua (~50 lines) - all the command-line options and their descriptions
  • data.lua (~60 lines) - contains the logic to create K threads for parallel data-loading (a sketch of this pattern follows the list).
  • donkey.lua (~200 lines) - contains the data-loading logic and details. It is run by each data-loader thread. Random image cropping, generating 10-crops, etc. are done here.
  • model.lua (~80 lines) - creates the AlexNet model and criterion
  • train.lua (~190 lines) - logic for training the network. We hard-code a learning rate + weight decay schedule that produces good results.
  • test.lua (~120 lines) - logic for testing the network on the validation set (including calculating top-1 and top-5 errors)
  • dataset.lua (~430 lines) - a general-purpose data loader, mostly derived from here: imagenetloader.torch. That repo has docs and more examples of using this loader.
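
For readers unfamiliar with the pattern in data.lua, here is a minimal sketch of parallel data loading with the threads package. The thread count and the batch-producing job below are simplified stand-ins for what data.lua and donkey.lua actually do.

local Threads = require 'threads'
Threads.serialization('threads.sharedserialize')  -- share tensors between threads instead of copying them

-- spin up loader threads ("donkeys"); each one runs its own setup, like donkey.lua
local donkeys = Threads(
   4,                                    -- number of loader threads (illustrative)
   function() require 'torch' end,       -- runs in each thread at startup
   function(idx) print('started donkey ' .. idx) end
)

-- queue a job: the first function runs in a worker thread and returns a batch,
-- the second runs back on the main thread with that batch as its arguments
donkeys:addjob(
   function()
      local inputs = torch.randn(128, 3, 224, 224)      -- stand-in for real image loading
      local labels = torch.LongTensor(128):random(1000) -- stand-in for real labels
      return inputs, labels
   end,
   function(inputs, labels)
      -- this is where the training loop would feed the batch to the network
      print(inputs:size(1), labels:size(1))
   end
)
donkeys:synchronize()  -- wait for all queued jobs to finish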

More Repositories

1. ganhacks: starter from "How to Train a GAN?" at NIPS2016 (10,908 stars)
2. convnet-benchmarks: Easy benchmarking of all publicly accessible implementations of convnets (Python, 2,675 stars)
3. dcgan.torch: A torch implementation of http://arxiv.org/abs/1511.06434 (Lua, 1,427 stars)
4. cvpr2015 (Jupyter Notebook, 869 stars)
5. cudnn.torch: Torch-7 FFI bindings for NVIDIA CuDNN (Lua, 408 stars)
6. torch-android: Torch-7 for Android (CMake, 275 stars)
7. talks (Jupyter Notebook, 261 stars)
8. net2net.torch: Implementation of http://arxiv.org/abs/1511.05641 that lets one build a larger net starting from a smaller one. (Lua, 159 stars)
9. imagenetloader.torch: some old code that i wrote, might be useful to others (Shell, 88 stars)
10. deepmind-atari (Lua, 67 stars)
11. lua---audio: Module for torch to support audio i/o as well as do common operations like dFFT, generate spectrograms etc. (C, 67 stars)
12. inception.torch: Torch port of https://github.com/google/inception (Jupyter Notebook, 66 stars)
13. torch-signal: Signal processing toolbox for Torch 7 (Lua, 48 stars)
14. cuda-convnet2.torch: Torch7 bindings for cuda-convnet2 kernels! (Cuda, 40 stars)
15. matio-ffi.torch: A LuaJIT FFI interface to MATIO and simple bindings for torch (Lua, 39 stars)
16. galaxyzoo: Entry for GalaxyZoo challenge (Lua, 35 stars)
17. eyescream (JavaScript, 35 stars)
18. nextml (35 stars)
19. examplepackage.torch: A hello-world for torch packages (CMake, 23 stars)
20. sunfish.lua: tiny and basic chess engine for lua. Port of https://github.com/thomasahle/sunfish (Lua, 20 stars)
21. kaggle_retinopathy_starter.torch: A starter kit in Torch for Kaggle Diabetic Retinopathy Detection (Lua, 19 stars)
22. neon.torch: Nervana Neon kernels in Torch (Lua, 18 stars)
23. torch-ship-binaries: A page describing how to ship torch binaries without sharing the source code of your scripts. (17 stars)
24. nnjs (JavaScript, 16 stars)
25. deep_gitstats: Based on SciPy's normalized git stats, adapted for Deep Learning frameworks (Jupyter Notebook, 16 stars)
26. cifar.torch (Lua, 15 stars)
27. torch.js: nodejs bindings for libTH (tensor library that powers torch). for fun! (JavaScript, 14 stars)
28. fakecuda: A convenient package for the lazy torch programmer to leave all your :cuda() calls as-is when running on CPU (Lua, 14 stars)
29. rgbd_streamer (Python, 12 stars)
30. mscoco.torch (Lua, 11 stars)
31. torch-docker: Dockerfile to create an image for Torch7 (Shell, 10 stars)
32. NeuralNetworks.jl: hacking torch-like neural networks in Julia (Julia, 10 stars)
33. torch-cheatsheet: A quick page for everything Torch (9 stars)
34. fftw3-ffi: A LuaJIT FFI interface to FFTW3 (Lua, 5 stars)
35. thnb: iTorch notebooks (4 stars)
36. lzmqstatic: Self-contained statically linked zeromq bindings for lua (C++, 3 stars)
37. nvblog_rnnlstm (HTML, 3 stars)
38. fairmark1 (Lua, 2 stars)
39. cunnsparse (Lua, 2 stars)
40. yasa: Yet another Sentiment analyzer. This one uses convolution networks. (Lua, 1 star)
41. cunnCUDA: some depreceated, ugly and old modules (Cuda, 1 star)
42. housenumbers_classifier: An attempt on the Stanford Housenumbers dataset (Lua, 1 star)
43. Bar__ZEbulLonX22L.torch: wtf (1 star)