Multi-Resolution CNNs for Large-Scale Scene Recognition
Here we provide the code and models for the following paper (Arxiv Preprint):
Knowledge Guided Disambiguation for Large-Scale Scene Classification with Multi-Resolution CNNs
Limin Wang, Sheng Guo, Weilin Huang, Yuanjun Xiong, and Yu Qiao
in IEEE Transactions on Image Processing, 2017
Updates
- February 21st, 2017
- Release the code and models
- January 3rd, 2017
- Initialize the repo
Overview
We have made two efforts to exploit CNNs for large-scale scene recognition:
- We design a modular framework to capture multi-level visual information for scene understanding by training CNNs from different resolutions
- We propose a knowledge disambiguation strategy by using soft labels from extra networks to deal with the label ambiguity issue of scene recognition.
These two efforts are the core part of team "SIAT_MMLAB" for the following large-scale scene recogntion challenges.
Challenge | Rank | Performance |
---|---|---|
Places2 challenge 2015 | 2nd place | 0.1736 top5-error |
Places2 challenge 2016 | 4th place | 0.1042 top5-error |
LSUN challenge 2015 | 2nd place | 0.9030 top1-accuracy |
LSUN challenge 2016 | 1st place | 0.9161 top1-accuracy |
Places365 Models
We first release the learned models on the Places365 dataset.
- Models learned at resolution of 256 * 256
Model | Top5 Error Rate |
---|---|
(A0) Normal BN-Inception | 0.143 |
(A1) Normal BN-Inception + object networks | 0.141 |
(A2) Normal BN-Inception + scene networks | 0.134 |
- Models learned at resolution of 384 * 384
Model | Top5 Error Rate |
---|---|
(B0) Deeper BN-Inception | 0.140 |
(B1) Deeper BN-Inception + object networks | 0.136 |
(B2) Deeper BN-Inception + scene networks | 0.130 |
- Download initialization and reference models
We release the scripts at the directory of scripts/
.
Try bash scripts/get_init_models.sh
to downdload knowldege models.
Try bash scripts/get_reference_models.sh
to download reference models.
Testing Code
We release the testing code on the Places365 validation dataset at the directory of matlab/
.
We also release a demo code to use our Places365 model as generic feature extraction and perform scene recognition on the MIT Indoor67 dataset at the directory of matlab/
.
Training Code
We release the models at the directory of models/
and the training scripts at the directory of scripts/
.
Try bash scripts/256_inception2_train.sh
to train standard CNNs.
Try bash scripts/256_kd_object_inception2_train.sh
to train knowledge disambiguation networks (by object network).
Try bash scripts/256_kd_scene_inception2_train.sh
to train knowledge disambiguation netowrks (by scene network).
The training code is based on our modified Caffe toolbox. It is a efficient parallel caffe with MPI implementation. Meanwhile, we implement a new kl-divergence loss layer for our knowledge disambiguation methods;
https://github.com/yjxiong/caffe/tree/kd
Questions
Contact