Development kit for the data of the Places365-Standard and Places365-Challenge

Places365 Data Development Kit

Introduction

This is the documentation of the Places365 data development kit. If you want the Places-CNN models instead of the training data, please refer to the Places365-models.

Table of contents:

  • Overview of Places Database
  • Image data details for Places365-Standard and Places365-Challenge
    1. Image list and annotations
    2. Submission format
    3. Evaluation routines
  • Overview of the Places-extra69 data

Please contact Bolei Zhou for questions, comments, or bug reports.

Downloads

  • First download the image list and annotations for Places365-Standard and the image list and annotations for Places365-Challenge, and decompress the files into the data folder. These files contain only the image lists, not the actual images.

  • Then download the corresponding compressed image files here. These files contain the actual images of the Places database.

Overview of Places365-Standard Data

The category list of Places365 is in categories_places365.txt. There are three types of image data for Places365-Standard: training data from Places365-Standard (TRAINING), validation data (VALIDATION), and test data (TEST). There is no overlap between the three sources of data: TRAINING, VALIDATION, and TEST. All three sets of data contain images of 365 categories of scenes.

                      Number of images
Dataset               TRAIN        VALIDATION   TEST
Places365-Standard    1,803,460    36,500       328,500

Every image in the training, validation and test sets has a single image-level label specifying the presence of one scene category.

Places365-Standard statistics:

Training:

- 1,803,460 images, with between 3,068 and 5,000 per category

Validation:

- 36,500 images, with 100 images per category

Test:

- 328,500 images, with 900 images per category

Packaging details:

The 3 sets of images (training, validation and test) are available as 3 tar archives. All images are in JPEG format. We provide both the original images and images resized to 256*256 for download.

Overview of Places365-Challenge Data

There are three types of image data for this competition: training data from Places365-Challenge (TRAINING), validation data (VALIDATION), and test data (TEST). There is no overlap between the three sources of data: TRAINING, VALIDATION, and TEST. All three sets of data contain images of 365 categories of scenes. The VALIDATION and TEST sets are the same as in Places365-Standard. The first 5,000 images in each category (or fewer, as the number is bounded by the total number of images in that category) are the images from the Places365-Standard train set.

                      Number of images
Dataset               TRAIN        VALIDATION   TEST
Places365-Challenge   8,026,628    36,500       328,500

Every image in the training, validation and test sets has a single image-level label specifying the presence of one scene category.

Places365-Challenge statistics:

Training:

- 8,026,628 images, with between 3,068 and 40,000 per category

Validation:

- 36,500 images, with 100 images per category

Test:

- 328,500 images, with 900 images per category

Packaging details:

The 3 sets of images (training, validation and test) are available as 3 tar archives. All images are in JPEG format. We provide both the original images and images resized to 256*256 for download.

Details of the data

The 365 scene categories used in the challenge dataset are part of the Places2 dataset.

All the class names and ids are available in: categories_places365.txt, where each line contains the scene category name followed by its id (an integer between 0 and 364).
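As a minimal sketch of reading that file, the following Python snippet maps category names to ids. It assumes the name and the id on each line are separated by whitespace; the sample lines are placeholders for illustration, not entries copied from categories_places365.txt.

```python
def parse_categories(lines):
    """Map scene category name -> integer id from 'name id' lines."""
    name_to_id = {}
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        # The id is the last whitespace-separated field (assumed layout).
        name, raw_id = line.rsplit(maxsplit=1)
        class_id = int(raw_id)
        if not 0 <= class_id <= 364:
            raise ValueError(f"category id out of range: {line!r}")
        name_to_id[name] = class_id
    return name_to_id


# Placeholder lines: the names and ids are illustrative only.
sample = ["abbey 0", "swimming_pool/indoor 1", "zen_garden 364"]
categories = parse_categories(sample)
id_to_name = {class_id: name for name, class_id in categories.items()}
print(categories["zen_garden"], id_to_name[1])
```

With the real file, the same function can be called on `open("categories_places365.txt")`.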

The difference between Places365-Challenge and Places365-Standard is that Places365-Challenge contains about 6.2 million additional training images. The first 5,000 images (or fewer) per category in Places365-Challenge belong to Places365-Standard.
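The figures quoted in this document can be cross-checked with a few lines of arithmetic. The sketch below uses only numbers stated above; the variable names are ours.

```python
num_categories = 365
standard_train = 1_803_460
challenge_train = 8_026_628

# Additional training images in Places365-Challenge.
extra_train = challenge_train - standard_train
print(f"{extra_train:,}")  # 6,223,168, i.e. about 6.2 million

# Validation and test sizes follow from the per-category counts.
val_images = num_categories * 100
test_images = num_categories * 900
print(f"{val_images:,} {test_images:,}")  # 36,500 328,500

# The Standard training total lies between the per-category bounds.
lower = num_categories * 3_068
upper = num_categories * 5_000
print(lower <= standard_train <= upper)  # True
```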

1 Training data

Each image is considered as belonging to a particular scene category. See [1] for more details of the collection and labeling strategy.

The training images may be downloaded as a single tar archive. Within it there is a tar file for each initial letter, from 'a.tar' to 'z.tar'. Note that there are only 24 such files, as no scene category names in our database begin with 'q' or 'x'.

After untarring all of the above files, the directory structure should look similar to the following:

  a/abbey/00000000.jpg
  a/abbey/00000001.jpg
  ...
  z/zen_garden/00009067.jpg
  z/zen_garden/00009068.jpg
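The two-level unpacking can be scripted. The following is a minimal Python sketch, assuming the outer archive holds the per-letter tar files and that each inner archive stores paths relative to its own location; the function name and the archive path passed to it are hypothetical.

```python
import string
import tarfile
from pathlib import Path

# One inner archive per initial letter, skipping 'q' and 'x' (24 letters).
LETTERS = [c for c in string.ascii_lowercase if c not in "qx"]


def extract_training_images(outer_tar, dest):
    """Unpack the outer archive, then every inner '<letter>.tar' found."""
    dest = Path(dest)
    dest.mkdir(parents=True, exist_ok=True)
    with tarfile.open(outer_tar) as outer:
        outer.extractall(dest)
    # Collect the inner archives first, then unpack each one in place.
    for inner_path in sorted(dest.rglob("*.tar")):
        if inner_path.stem not in LETTERS:
            continue  # ignore anything that is not a per-letter archive
        with tarfile.open(inner_path) as inner:
            inner.extractall(inner_path.parent)
```

Only unpack archives obtained from the official download location, since extracting untrusted tar files can overwrite files outside the destination folder.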

In general, each leaf folder contains one scene category. Note that there are some categories that are fine-grained, e.g., s/swimming_pool/indoor and s/swimming_pool/outdoor. The complete list of training images and their mapping to scene category ids is available in: data/places365_train_challenge.txt
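Given this layout, the category name can be recovered from an image path by dropping the leading letter folder and the file name. This is a small Python sketch under that assumption; the helper name and the file name used for the fine-grained example are hypothetical.

```python
from pathlib import PurePosixPath


def category_from_path(relative_path):
    """Return the scene category encoded in '<letter>/<category...>/<file>'."""
    parts = PurePosixPath(relative_path.lstrip("/")).parts
    if len(parts) < 3:
        raise ValueError(f"unexpected path layout: {relative_path!r}")
    # Everything between the letter folder and the file name is the category,
    # which keeps fine-grained names such as 'swimming_pool/indoor' intact.
    return "/".join(parts[1:-1])


print(category_from_path("a/abbey/00000000.jpg"))
print(category_from_path("s/swimming_pool/indoor/00000000.jpg"))
```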

All images are in JPEG format. We also include data/places365_train_standard.txt here, but you do not need to use it.
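A quick sanity check on a training list is to count the images per category. The sketch below assumes each line of the list has the form "<image path> <category id>", which this document does not spell out; the sample lines and their ids are placeholders.

```python
from collections import Counter


def count_images_per_category(lines):
    """Count how many listed images carry each category id."""
    counts = Counter()
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        # Assumed layout: the category id is the last field on the line.
        _path, raw_id = line.rsplit(maxsplit=1)
        counts[int(raw_id)] += 1
    return counts


# Placeholder entries; the ids are illustrative only.
sample = [
    "a/abbey/00000000.jpg 0",
    "a/abbey/00000001.jpg 0",
    "z/zen_garden/00009067.jpg 364",
]
counts = count_images_per_category(sample)
print(min(counts.values()), max(counts.values()))
```

Run on data/places365_train_challenge.txt, the minimum and maximum should fall within the per-category range quoted in the statistics above, if the assumed line format holds.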

2 Validation data

There are a total of 36,500 validation images. They are named as

  Places365_val_00000001.jpg
  Places365_val_00000002.jpg
  ...
  Places365_val_00036499.jpg
  Places365_val_00036500.jpg

There are 100 validation images for each scene category.

The classification ground truth of the validation images is in data/places365_val.txt, where each line contains one image filename and its corresponding scene category label (from 0 to 364).
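A minimal Python sketch for loading this ground truth follows. It assumes the filename and the label are separated by whitespace; the labels in the sample lines are placeholders, not the real annotations.

```python
from collections import Counter


def load_ground_truth(lines):
    """Map image filename -> scene category label (0 to 364)."""
    labels = {}
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        filename, raw_label = line.rsplit(maxsplit=1)
        label = int(raw_label)
        if not 0 <= label <= 364:
            raise ValueError(f"label out of range: {line!r}")
        labels[filename] = label
    return labels


# Placeholder lines; the labels are illustrative only.
sample = [
    "Places365_val_00000001.jpg 165",
    "Places365_val_00000002.jpg 358",
]
ground_truth = load_ground_truth(sample)
per_category = Counter(ground_truth.values())
print(len(ground_truth), len(per_category))
```

On the full file one would expect 36,500 entries and 365 categories with 100 images each.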

3 Test data

There are a total of 328,500 test images. The test files are named as

  Places365_test_00000001.jpg
  Places365_test_00000002.jpg
  ...
  Places365_test_00328499.jpg
  Places365_test_00328500.jpg

There are 900 test images for each scene category. The ground truth annotations will not be released.
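The validation and test file names follow the same pattern: a split name and an eight-digit, zero-padded index starting at 1. A short Python sketch that generates them (the helper name is ours):

```python
def split_filenames(split, count):
    """Return the expected image names for a split, numbered from 1."""
    return [f"Places365_{split}_{index:08d}.jpg" for index in range(1, count + 1)]


val_names = split_filenames("val", 36_500)
test_names = split_filenames("test", 328_500)
print(val_names[0], val_names[-1])
print(test_names[0], test_names[-1])
```

Because the index is zero-padded, alphabetical order and numerical order of the names coincide.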

Submission format

The submission of results on test data will consist of a text file with one line per image, in the alphabetical order of the image file names, i.e. from Places365_test_00000001.jpg to Places365_test_00328500.jpg. Each line contains the 5 predicted scene categories, sorted by confidence in descending order.

The format is as follows:

<filename> <label(1)> <label(2)> <label(3)> <label(4)> <label(5)>

The predicted labels are the scene categories (integers between 0 and 364). The number of labels per line must be exactly 5; otherwise the evaluation will report an error. The filename is the same as mentioned above, e.g., 'Places365_test_00000001.jpg' and so on.

An example file on the validation data is

evaluation/demo.val.pred.txt
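The following Python sketch shows one way to produce such lines from per-category scores. It assumes each line starts with the image filename followed by the five labels, as the mention of the filename above suggests; the function names and the score vector are hypothetical.

```python
def rank_top5(scores):
    """Return the indices of the 5 highest scores, most confident first."""
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    return order[:5]


def format_submission_line(filename, labels):
    """Build one submission line; exactly 5 labels in 0..364 are required."""
    if len(labels) != 5:
        raise ValueError("each line must contain exactly 5 labels")
    if any(not 0 <= label <= 364 for label in labels):
        raise ValueError("labels must be integers between 0 and 364")
    return " ".join([filename, *(str(label) for label in labels)])


# Hypothetical scores for one image over the 365 categories.
scores = [0.0] * 365
scores[10], scores[3], scores[364], scores[0], scores[200] = 0.9, 0.5, 0.4, 0.3, 0.2

line = format_submission_line("Places365_test_00000001.jpg", rank_top5(scores))
print(line)  # Places365_test_00000001.jpg 10 3 364 0 200
```

Writing one such line per test image, in file name order, yields a file in the format described above.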

Evaluation routines

The Matlab routine for evaluating the submission is

./evaluation/eval_cls.m

To see an example of using the routines, start Matlab in the 'evaluation/' folder and type demo_eval_cls;

and you will see something similar to the following output:

PLACES365 SCENE CLASSIFICATION TASK
pred_file: demo.val.pred.txt
ground_truth_file: ../data/places365_val.txt

guesses vs cls error

1.0000    0.9974
2.0000    0.9944
3.0000    0.9920
4.0000    0.9893
5.0000    0.9867

In this demo, we take the top i (i = 1, ..., 5) predictions from your result file, ignoring the rest, and report the error as a function of the number of guesses.

Only the error with 5 guesses will be used to determine the winner.

(The demo.val.pred.txt used here is a synthetic result.)
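For readers without Matlab, the idea can be sketched in Python. This is not a port of eval_cls.m; it assumes the usual top-k definition, in which an image counts as an error with k guesses when its ground-truth label is not among its first k predictions. The image names and labels are toy data.

```python
def top_k_errors(predictions, ground_truth, max_k=5):
    """Return the error rate for k = 1..max_k guesses per image."""
    errors = []
    for k in range(1, max_k + 1):
        wrong = sum(
            1
            for name, label in ground_truth.items()
            if label not in predictions[name][:k]
        )
        errors.append(wrong / len(ground_truth))
    return errors


# Toy data: four images with five ranked guesses each.
ground_truth = {"img1": 3, "img2": 7, "img3": 1, "img4": 0}
predictions = {
    "img1": [3, 1, 2, 4, 5],  # correct with 1 guess
    "img2": [1, 7, 2, 4, 5],  # correct with 2 guesses
    "img3": [9, 8, 7, 6, 1],  # correct only with 5 guesses
    "img4": [9, 8, 7, 6, 5],  # never correct
}
print(top_k_errors(predictions, ground_truth))  # [0.75, 0.5, 0.5, 0.5, 0.25]
```

For comparison, k distinct guesses drawn uniformly at random over 365 categories give an expected error of 1 - k/365, about 0.9973 for k = 1 and 0.9863 for k = 5, which is close to the synthetic demo output above.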

Overview of the Places-Extra69 Data

In total there are 434 scene categories in the Places Database. Besides the Places365 data released above, here we release the data of the extra 69 scene categories. The category list of the extra 69 categories is here, where each line contains the scene category name followed by its id (an integer between 0 and 68). Download the images at the project page. The compressed file contains train and test splits. For each category, we leave 100 images out as the test images. There are 98,721 images for training and 6,600 images for testing. Categories that do not have enough images are not included in the testing split.

This Places-extra69 data could potentially be used for research on one-shot learning, few-shot learning, or transfer learning.
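The Places-extra69 figures can be cross-checked in the same way. The sketch below uses only the numbers stated above and assumes that every category included in the testing split contributes exactly 100 test images.

```python
total_categories = 434
places365_categories = 365
extra_categories = total_categories - places365_categories
print(extra_categories)  # 69

train_images = 98_721
test_images = 6_600
test_per_category = 100

# Under the 100-per-category assumption, this many categories have test images.
categories_with_test = test_images // test_per_category
categories_without_test = extra_categories - categories_with_test
print(categories_with_test, categories_without_test)  # 66 3

total_extra_images = train_images + test_images
print(f"{total_extra_images:,}")  # 105,321
```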

Reference

Link: Places2 Database, Places1 Database

Please cite the following IEEE Transactions on Pattern Analysis and Machine Intelligence paper if you use the data or pre-trained CNN models.

 @article{zhou2017places,
   title={Places: A 10 million Image Database for Scene Recognition},
   author={Zhou, Bolei and Lapedriza, Agata and Khosla, Aditya and Oliva, Aude and Torralba, Antonio},
   journal={IEEE Transactions on Pattern Analysis and Machine Intelligence},
   year={2017},
   publisher={IEEE}
 }

License

The pre-trained Places-CNN models can be used under the Creative Commons License (Attribution CC BY). Please give appropriate credit to our work, such as providing a link to the paper or the Places project page. The copyright of all the images from the Places and Places2 databases belongs to the image owners.
