• Stars: 1,297
• Rank: 36,273 (top 0.8%)
• Language: Lua
• License: BSD 2-Clause
• Created: over 8 years ago
• Updated: about 5 years ago


Repository Details

3.8% and 18.3% test error on CIFAR-10 and CIFAR-100

Wide Residual Networks

This code was used for experiments with Wide Residual Networks (BMVC 2016) http://arxiv.org/abs/1605.07146 by Sergey Zagoruyko and Nikos Komodakis.

Deep residual networks were shown to be able to scale up to thousands of layers and still have improving performance. However, each fraction of a percent of improved accuracy costs nearly doubling the number of layers, so training very deep residual networks suffers from the problem of diminishing feature reuse, which makes these networks very slow to train.

To tackle these problems, in this work we conduct a detailed experimental study on the architecture of ResNet blocks, based on which we propose a novel architecture where we decrease the depth and increase the width of residual networks. We call the resulting network structures wide residual networks (WRNs) and show that these are far superior to their commonly used thin and very deep counterparts.

For example, we demonstrate that even a simple 16-layer-deep wide residual network outperforms in accuracy and efficiency all previous deep residual networks, including thousand-layer-deep networks. We further show that WRNs achieve incredibly good results (e.g., achieving new state-of-the-art results on CIFAR-10, CIFAR-100, SVHN, COCO and substantial improvements on ImageNet) and train several times faster than pre-activation ResNets.
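To make the depth/width trade-off concrete, below is a minimal PyTorch sketch of a pre-activation wide residual block, where the widening factor k multiplies the number of channels in the 3x3 convolutions. This is an illustrative re-implementation under our own naming (WideBasicBlock), not the repository's Lua/Torch code.

import torch
import torch.nn as nn
import torch.nn.functional as F

class WideBasicBlock(nn.Module):
    # pre-activation residual block; width is controlled by the widening factor k
    def __init__(self, in_planes, planes, k=10, stride=1, dropout=0.0):
        super().__init__()
        width = planes * k  # widen the 3x3 convolutions by k
        self.bn1 = nn.BatchNorm2d(in_planes)
        self.conv1 = nn.Conv2d(in_planes, width, 3, stride=stride, padding=1, bias=False)
        self.bn2 = nn.BatchNorm2d(width)
        self.conv2 = nn.Conv2d(width, width, 3, stride=1, padding=1, bias=False)
        self.dropout = nn.Dropout(dropout)
        # 1x1 projection when the identity branch does not match in shape
        self.shortcut = None
        if stride != 1 or in_planes != width:
            self.shortcut = nn.Conv2d(in_planes, width, 1, stride=stride, bias=False)

    def forward(self, x):
        out = F.relu(self.bn1(x))
        shortcut = self.shortcut(out) if self.shortcut is not None else x
        out = self.conv1(out)
        out = self.dropout(F.relu(self.bn2(out)))
        out = self.conv2(out)
        return out + shortcut

In the paper's notation, WRN-d-k denotes a network of depth d and widening factor k; WRN-28-10, for example, uses k = 10 on top of base widths of 16, 32 and 64 channels per stage.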

Update (August 2019): Pretrained ImageNet WRN models are available in torchvision 0.4 and PyTorch Hub, e.g. loading WRN-50-2:

import torch
model = torch.hub.load('pytorch/vision', 'wide_resnet50_2', pretrained=True)
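As a quick usage sketch of our own (not from the repo), the loaded model can be put in eval mode and run on an input normalized with the usual ImageNet per-channel mean and std:

model.eval()
# stand-in for a preprocessed 224x224 RGB image scaled to [0, 1]
x = torch.rand(1, 3, 224, 224)
mean = torch.tensor([0.485, 0.456, 0.406]).view(1, 3, 1, 1)
std = torch.tensor([0.229, 0.224, 0.225]).view(1, 3, 1, 1)
with torch.no_grad():
    logits = model((x - mean) / std)  # shape (1, 1000)
print(logits.argmax(dim=1))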

Update (November 2016): We updated the paper with ImageNet, COCO and meanstd preprocessing CIFAR results. If you're comparing your method against WRN, please report correct preprocessing numbers because they give substantially different results.

tl;dr: ImageNet WRN-50-2-bottleneck (ResNet-50 with a wider inner bottleneck 3x3 convolution) is significantly faster than ResNet-152 and has better accuracy; on CIFAR, meanstd preprocessing (as in fb.resnet.torch) gives better results than ZCA whitening; on COCO, a wide ResNet with 34 layers outperforms even an Inception-v4-based Fast-RCNN model in single-model performance.
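Here "meanstd" preprocessing simply means per-channel mean/std normalization computed on the training set (as in fb.resnet.torch), rather than ZCA whitening. A rough numpy sketch of ours, not the repository's code:

import numpy as np

# train/test: float32 arrays of shape (N, 3, 32, 32)
def meanstd_normalize(train, test):
    mean = train.mean(axis=(0, 2, 3), keepdims=True)  # per-channel mean from the training set
    std = train.std(axis=(0, 2, 3), keepdims=True)    # per-channel std from the training set
    return (train - mean) / std, (test - mean) / std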

Test error (%, flip/translation augmentation, meanstd normalization, median of 5 runs) on CIFAR:

Network            CIFAR-10  CIFAR-100
pre-ResNet-164     5.46      24.33
pre-ResNet-1001    4.92      22.71
WRN-28-10          4.00      19.25
WRN-28-10-dropout  3.89      18.85

Single-run results (meanstd normalization):

Dataset                    Network              Test perf.
CIFAR-10                   WRN-40-10-dropout    3.8%
CIFAR-100                  WRN-40-10-dropout    18.3%
SVHN                       WRN-16-8-dropout     1.54%
ImageNet (single crop)     WRN-50-2-bottleneck  21.9% top-1, 5.79% top-5
COCO-val5k (single model)  WRN-34-2             36 mAP

See http://arxiv.org/abs/1605.07146 for details.

bibtex:

@INPROCEEDINGS{Zagoruyko2016WRN,
    author = {Sergey Zagoruyko and Nikos Komodakis},
    title = {Wide Residual Networks},
    booktitle = {BMVC},
    year = {2016}}

Pretrained models

ImageNet

WRN-50-2-bottleneck (wider bottleneck); see the pretrained folder for details
Download (263MB): https://yadi.sk/d/-8AWymOPyVZns

There are also PyTorch and Tensorflow model definitions with pretrained weights at https://github.com/szagoruyko/functional-zoo/blob/master/wide-resnet-50-2-export.ipynb

COCO

Coming

Installation

The code depends on Torch (http://torch.ch). Install Torch following the instructions there, then run:

luarocks install torchnet
luarocks install optnet
luarocks install iterm

For visualizing training curves we used an IPython notebook with pandas and bokeh.

Usage

Dataset support

The code supports loading simple datasets in torch (.t7) format. The whitened CIFAR datasets we used were prepared as follows:

To whiten CIFAR-10 and CIFAR-100 we used the pylearn2 script https://github.com/lisa-lab/pylearn2/blob/master/pylearn2/scripts/datasets/make_cifar10_gcn_whitened.py, then converted the output to torch format with https://gist.github.com/szagoruyko/ad2977e4b8dceb64c68ea07f6abf397b and the npy-to-torch converter https://github.com/htwaijry/npy4th.
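That pylearn2 script performs global contrast normalization followed by ZCA whitening; a rough numpy sketch of the same idea (ours, not the actual script, and with an arbitrary eps) is:

import numpy as np

def gcn_zca_whiten(train, test, eps=0.1):
    # train/test: float arrays of shape (N, 3072), one flattened CIFAR image per row
    def gcn(x):  # global contrast normalization: zero mean, unit norm per image
        x = x - x.mean(axis=1, keepdims=True)
        return x / np.sqrt((x ** 2).sum(axis=1, keepdims=True) + 1e-8)
    train, test = gcn(train), gcn(test)
    # ZCA whitening matrix estimated on the training set only
    mean = train.mean(axis=0)
    cov = np.cov(train - mean, rowvar=False)
    U, S, _ = np.linalg.svd(cov)
    W = U @ np.diag(1.0 / np.sqrt(S + eps)) @ U.T
    return (train - mean) @ W, (test - mean) @ W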

We are running ImageNet experiments and will update the paper and this repo soon.

Training

We provide several scripts for reproducing the results in the paper; a few examples follow.

model=wide-resnet widen_factor=4 depth=40 ./scripts/train_cifar.sh

This will train WRN-40-4 on whitened CIFAR-10 (expected to be in the datasets folder). This network achieves about the same accuracy as ResNet-1001 and trains in 6 hours on a single Titan X. The log is saved to a logs/wide-resnet_$RANDOM$RANDOM folder with a JSON entry for each epoch and can be visualized later with iTorch/IPython.
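One way to plot such a log with pandas (a sketch of ours; the log path and the field names epoch, train_loss and test_acc are assumptions, and we use matplotlib here rather than bokeh for brevity):

import json
import pandas as pd
import matplotlib.pyplot as plt

# hypothetical log location; assumes one JSON object per epoch, one per line
with open('logs/wide-resnet_12345678/log.txt') as f:
    records = [json.loads(line) for line in f if line.strip().startswith('{')]
df = pd.DataFrame(records)

df.plot(x='epoch', y=['train_loss', 'test_acc'])  # hypothetical column names
plt.show()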

For reference we provide the logs for this experiment and an IPython notebook to visualize the results. After running it you should see training curves like these:

[figure: training curves]

Another example:

model=wide-resnet widen_factor=10 depth=28 dropout=0.3 dataset=./datasets/cifar100_whitened.t7 ./scripts/train_cifar.sh

This network achieves 20.0% error on CIFAR-100 in about a day on a single Titan X.

Multi-GPU training is supported via the nGPU=n parameter.

Other models

Additional model definitions beyond wide-resnet are included in this repo.

Implementation details

The code evolved from https://github.com/szagoruyko/cifar.torch. To reduce memory usage we use @fmassa's optnet, which automatically shares output and gradient tensors between modules. This keeps memory usage below 4 GB even for our best networks. It can also generate network graph plots, such as the one for WRN-16-2 at the end of this page.

Acknowledgements

We thank the startup company VisionLabs and Eugenio Culurciello for giving us access to their clusters; without them the ImageNet experiments would not have been possible. We also thank Adam Lerer and Sam Gross for helpful discussions. This work was supported by EC project FP7-ICT-611145 ROBOSPECT.

More Repositories

1. pytorchviz - A small package to create visualizations of PyTorch execution graphs (Jupyter Notebook, 3,180 stars)
2. attention-transfer - Improving Convolutional Networks via Attention Transfer (ICLR 2017) (Jupyter Notebook, 1,439 stars)
3. diracnets - Training Very Deep Neural Networks Without Skip-Connections (Jupyter Notebook, 586 stars)
4. functional-zoo - PyTorch and Tensorflow functional model definitions (Jupyter Notebook, 586 stars)
5. loadcaffe - Load Caffe networks in Torch7 (Protocol Buffer, 494 stars)
6. cvpr15deepcompare - Code and models for "Learning to Compare Image Patches via Convolutional Neural Networks" (C++, 467 stars)
7. pyinn - CuPy fused PyTorch neural networks ops (Python, 274 stars)
8. cifar.torch - 92.45% on CIFAR-10 in Torch (Lua, 174 stars)
9. torch-opencv-demos - Torch7+OpenCV+ConvNets (Lua, 167 stars)
10. binary-wide-resnet - PyTorch implementation of Wide Residual Networks with 1-bit weights by McDonnell (ICLR 2018) (Python, 124 stars)
11. imagine-nn - IMAGINE torch neural network routines (Lua, 109 stars)
12. torch-caffe-binding - Use Caffe in Torch7 (C++, 64 stars)
13. imagenet-validation.torch - Fast and easy testing of imagenet models (Lua, 49 stars)
14. neural-style-autograd - autograd version of https://github.com/jcjohnson/neural-style (Lua, 44 stars)
15. cunnproduction - easy embeddable Torch7 networks (C++, 35 stars)
16. nnpack.torch - Torch FFI-bindings for NNPACK (Lua, 30 stars)
17. iterm.torch - Display images directly in iTerm2 (Lua, 28 stars)
18. openai-gemm.pytorch - PyTorch bindings for openai-gemm (Python, 20 stars)
19. fastrcnn-models.torch - Fast-RCNN models in Torch-7 format (18 stars)
20. cutorch-rtc - lua apply function for cutorch (Lua, 17 stars)
21. idiap-tutorials - (Jupyter Notebook, 16 stars)
22. functional-style-transfer - minimal implementation of style transfer (Jupyter Notebook, 10 stars)
23. nvrtc.torch - Torch7 bindings for CUDA NVRTC (runtime compilation) library (Lua, 9 stars)
24. imi-demos - live convolutional neural networks demos (Python, 9 stars)
25. cunn-rtc - Runtime compiled Torch cunn modules (Lua, 8 stars)
26. clipp.torch - Torch interface to OpenCLIPP (C++, 6 stars)
27. examples - (Python, 5 stars)
28. libclsvm - OpenCL optimized SVM library (C++, 2 stars)
29. infimnist.torch - Torch7 InfiMNIST ffi binding (C, 1 star)