Repository Details

Largest multi-label image database; ResNet-101 model; 80.73% top-1 acc on ImageNet

Tencent ML-Images

This repository introduces the open-source project dubbed Tencent ML-Images, which publishes

  • ML-Images: the largest open-source multi-label image database, including 17,609,752 training and 88,739 validation image URLs, which are annotated with up to 11,166 categories
  • ResNet-101 model: pre-trained on ML-Images; it achieves 80.73% top-1 accuracy on ImageNet via transfer learning

Updates

  • [2019/12/26] Our manuscript describing this open-source project has been accepted by IEEE Access (Journal, ArXiv). It presents more details of the database, the loss function, and the training algorithm, as well as more experimental results.
  • [2018/12/19] We have simplified the procedure for downloading images. Please see Download Images.

Dependencies

Data

Image Source

The image URLs of ML-Images are collected from ImageNet and Open Images. Specifically,

  • Part 1: From the whole database of ImageNet, we adopt 10,706,941 training and 50,000 validation image URLs, covering 10,032 categories.
  • Part 2: From Open Images, we adopt 6,902,811 training and 38,739 validation image URLs, covering 1,134 unique categories (note that some other categories are merged with their synonymous categories from ImageNet).

Finally, ML-Images includes 17,609,752 training and 88,739 validation image URLs, covering 11,166 categories.

Download Images

Due to copyright restrictions, we cannot provide the original images directly. However, one can obtain all images of our database using the image ID and URL files described below.

Download Images from ImageNet

We find that many of the URLs provided by ImageNet have expired (please check the file List of all image URLs of Fall 2011 Release at http://image-net.org/download-imageurls). Thus, we provide here the original ImageNet image IDs used in our database. One can obtain the training/validation images of our database through the following steps:

  • Download the whole database of ImageNet
  • Extract the training/validation images using the image IDs in train_image_id_from_imagenet.txt and val_image_id_from_imagenet.txt
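
The extraction step can be sketched in Python as follows. This is a minimal sketch, not the project's own tooling: the function name `extract_images` and both directory arguments are hypothetical, and it assumes that the first whitespace-separated term of each row is an image ID that is also a valid relative path inside a local copy of ImageNet.

```python
import os
import shutil


def extract_images(id_file, imagenet_root, out_dir):
    """Copy every image listed in id_file from imagenet_root to out_dir.

    Returns the list of image IDs that were not found under imagenet_root.
    """
    missing = []
    with open(id_file) as f:
        for line in f:
            fields = line.split()
            if not fields:
                continue  # skip blank lines
            image_id = fields[0]  # assumed to be a relative path
            src = os.path.join(imagenet_root, image_id)
            dst = os.path.join(out_dir, image_id)
            if not os.path.isfile(src):
                missing.append(image_id)
                continue
            os.makedirs(os.path.dirname(dst), exist_ok=True)
            shutil.copy2(src, dst)
    return missing
```

Returning the missing IDs, rather than raising on the first one, makes it easier to see how complete a local ImageNet copy is.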

The format of train_image_id_from_imagenet.txt is as follows:

...
n04310904/n04310904_8388.JPEG   2367:1  2172:1  1831:1  1054:1  1041:1  865:1   2:1
n11753700/n11753700_1897.JPEG   5725:1  5619:1  5191:1  5181:1  5173:1  5170:1  1042:1  865:1   2:1
...

As shown above, each row corresponds to one image. The first term is the original image ID of ImageNet. The following terms, separated by whitespace, are the annotations. For example, "2367:1" indicates class 2367 with confidence 1. Note that the class index starts from 0, and you can find the class name in the file data/dictionary_and_semantic_hierarchy.txt.
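
A row in this format could be parsed along the following lines. The helper name `parse_annotation_line` is hypothetical, and the sketch assumes only what is described above: one image reference followed by `class:confidence` terms separated by whitespace.

```python
def parse_annotation_line(line):
    """Split one row into (image_ref, [(class_index, confidence), ...])."""
    fields = line.split()  # any run of spaces or tabs separates terms
    image_ref = fields[0]
    labels = []
    for term in fields[1:]:
        index, confidence = term.split(":")
        labels.append((int(index), float(confidence)))
    return image_ref, labels


row = "n04310904/n04310904_8388.JPEG   2367:1  2172:1  1831:1  1054:1  1041:1  865:1   2:1"
image_ref, labels = parse_annotation_line(row)
print(image_ref)   # n04310904/n04310904_8388.JPEG
print(labels[0])   # (2367, 1.0)
```

Only the terms after the first field are split on ":", so the same helper also works for rows whose first term is a URL.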

NOTE: We find that there are some repeated URLs in List of all image URLs of Fall 2011 Release of ImageNet, i.e., the image corresponding to one URL may be stored in multiple sub-folders with different image IDs. We manually checked a few repeated images and found the reason: an image annotated with a child class may also be annotated with its parent class, in which case it is saved to two sub-folders with different image IDs. To the best of our knowledge, this has not been reported by ImageNet or anywhere else. Anyone using ImageNet should be aware of it. As a result, there are also a few repeated images in our database, but our training is not significantly influenced. In the future, we will update the database by removing the repeated images.

Download Images from Open Images

The images from Open Images can be downloaded using URLs. The format of train_urls_from_openimages.txt is as follows:

...
https://c4.staticflickr.com/8/7239/6997334729_e5fb3938b1_o.jpg  3:1  5193:0.9  5851:0.9 9413:1 9416:1
https://c2.staticflickr.com/4/3035/3033882900_a9a4263c55_o.jpg  1053:0.8  1193:0.8  1379:0.8
...

As shown above, each row corresponds to one image. The first term is the image URL. The following terms, separated by whitespace, are the annotations. For example, "5193:0.9" indicates class 5193 with confidence 0.9.
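
For training, such annotations are often turned into one dense vector per image. The sketch below shows one way to do that; the function name `to_label_vector` and the `min_confidence` threshold are hypothetical, and storing the confidence itself (rather than a 0/1 flag) is an assumption, not necessarily what the released training code does.

```python
NUM_CLASSES = 11166  # number of categories stated for ML-Images


def to_label_vector(terms, num_classes=NUM_CLASSES, min_confidence=0.0):
    """Turn terms such as ["5193:0.9", "3:1"] into a dense confidence vector."""
    vector = [0.0] * num_classes
    for term in terms:
        index, confidence = term.split(":")
        index, confidence = int(index), float(confidence)
        if not 0 <= index < num_classes:
            raise ValueError("class index out of range: %d" % index)
        if confidence >= min_confidence:
            vector[index] = confidence
    return vector


row = "https://c2.staticflickr.com/4/3035/3033882900_a9a4263c55_o.jpg  1053:0.8  1193:0.8  1379:0.8"
url, *terms = row.split()
vector = to_label_vector(terms)
print(vector[1053], sum(1 for v in vector if v > 0))  # 0.8 3
```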

Download Images using URLs

We also provide the code to download images using URLs. As train_urls_from_openimages.txt is very large, here we provide a tiny file train_urls_tiny.txt to demonstrate the downloading procedure.

cd data
./download_urls_multithreading.sh

A sub-folder data/images will be generated to save the downloaded jpeg images, as well as a file train_im_list_tiny.txt to save the image list and the corresponding annotations.
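
For readers who prefer Python to the shell script, multithreaded downloading can be sketched with the standard library as below. This sketch is independent of download_urls_multithreading.sh and does not reproduce its behavior: the names `fetch` and `download_all`, the numeric output file names, and the default thread count are all assumptions made for illustration.

```python
import os
import urllib.request
from concurrent.futures import ThreadPoolExecutor


def fetch(url, timeout=10):
    """Return the raw bytes behind url (plain HTTP GET)."""
    with urllib.request.urlopen(url, timeout=timeout) as response:
        return response.read()


def download_all(rows, out_dir, num_threads=8, fetcher=fetch):
    """Download the URL in the first field of each row into out_dir.

    Returns (saved, failed): saved maps URL -> local path, failed lists URLs.
    """
    os.makedirs(out_dir, exist_ok=True)
    urls = [row.split()[0] for row in rows if row.strip()]

    def worker(item):
        index, url = item
        path = os.path.join(out_dir, "%08d.jpg" % index)
        try:
            data = fetcher(url)
        except Exception:  # expired links are expected, so record and continue
            return url, None
        with open(path, "wb") as f:
            f.write(data)
        return url, path

    saved, failed = {}, []
    with ThreadPoolExecutor(max_workers=num_threads) as pool:
        for url, path in pool.map(worker, enumerate(urls)):
            if path is None:
                failed.append(url)
            else:
                saved[url] = path
    return saved, failed
```

Passing the fetch function as a parameter keeps the threading logic testable without network access, and collecting failures instead of aborting suits a URL list in which some links may have expired.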

Semantic Hierarchy

We build the semantic hierarchy of the 11,166 categories according to WordNet. The direct parent categories of each class can be found in the file data/dictionary_and_semantic_hierarchy.txt. The whole semantic hierarchy includes 4 independent trees, whose root nodes are "thing", "matter", "object, physical object" and "atmospheric phenomenon", respectively. The length of the longest semantic path from a root to a leaf node is 16, and the average length is 7.47.
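
Path lengths of this kind can be computed from a child-to-parents mapping, as in the sketch below. The toy hierarchy and its node names are hypothetical, the layout of data/dictionary_and_semantic_hierarchy.txt is not reproduced here, and counting nodes (rather than edges) along a path is an assumption about how "length" is defined.

```python
from functools import lru_cache

# Hypothetical toy hierarchy: child -> direct parents (roots have none).
PARENTS = {
    "r1": [],
    "a": ["r1"],
    "b": ["a"],
    "c": ["b"],
    "r2": [],
    "d": ["r2"],
}


@lru_cache(maxsize=None)
def depth(category):
    """Number of nodes on the longest path from a root down to category."""
    parents = PARENTS[category]
    if not parents:
        return 1
    return 1 + max(depth(p) for p in parents)


def leaves():
    """Categories that are not the parent of any other category."""
    inner = {p for parents in PARENTS.values() for p in parents}
    return [c for c in PARENTS if c not in inner]


leaf_depths = [depth(c) for c in leaves()]
print(max(leaf_depths))                     # 4
print(sum(leaf_depths) / len(leaf_depths))  # 3.0
```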

Annotations

Since the image URLs of ML-Images are collected from ImageNet and Open Images, the annotations of ML-Images are constructed based on the original annotations from ImageNet and Open Images. Note that the original annotations from Open Images are licensed by Google Inc. under CC BY-4.0. Specifically, we conduct the following steps to construct the new annotations of ML-Images.

  • For the 6,902,811 training URLs from Open Images, we remove the annotated tags that fall outside the remaining 1,134 categories.
  • According to the constructed semantic hierarchy of 11,166 categories, we augment the annotations of all URLs of ML-Images following the criterion that if a URL is annotated with category i, then all ancestor categories of i are also annotated to that URL.
  • We train a ResNet-101 model with 1,134 outputs on the 6,902,811 training URLs from Open Images. Using this model, we predict tags from the 1,134 categories for the 10,756,941 single-annotated image URLs from ImageNet (10,706,941 training plus 50,000 validation). From these predictions we obtain a normalized co-occurrence matrix between the 10,032 categories from ImageNet and the 1,134 categories from Open Images, which lets us identify strongly co-occurring pairs of categories. For example, if categories i and j co-occur strongly, then any image annotated with category i is also annotated with category j.
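
The second and third steps can be illustrated with the following sketch. All function names are hypothetical. The normalization used here (dividing each pair count by the number of images of the ImageNet category) and the 0.9 threshold for "strongly co-occurring" are assumptions, since neither is specified above.

```python
from collections import defaultdict


def ancestors(category, parents):
    """All ancestors of category; parents maps child -> direct parents."""
    found, stack = set(), list(parents.get(category, []))
    while stack:
        node = stack.pop()
        if node not in found:
            found.add(node)
            stack.extend(parents.get(node, []))
    return found


def augment(labels, parents):
    """Add every ancestor of every label to the label set."""
    result = set(labels)
    for label in labels:
        result |= ancestors(label, parents)
    return result


def cooccurrence(samples):
    """samples: (imagenet_category, predicted_open_images_tags) pairs.

    Returns {(i, j): fraction of images of category i predicted with tag j}.
    """
    image_count = defaultdict(int)
    pair_count = defaultdict(int)
    for category, tags in samples:
        image_count[category] += 1
        for tag in set(tags):
            pair_count[(category, tag)] += 1
    return {pair: n / image_count[pair[0]] for pair, n in pair_count.items()}


parents = {3: [2], 2: [1], 1: []}           # hypothetical chain 1 -> 2 -> 3
print(sorted(augment({3}, parents)))        # [1, 2, 3]

samples = [(10, [7, 8]), (10, [7]), (11, [8])]
matrix = cooccurrence(samples)
strong = {pair for pair, score in matrix.items() if score >= 0.9}
print(sorted(strong))                       # [(10, 7), (11, 8)]
```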

The annotations of all URLs in ML-Images are stored in train_urls.txt and val_urls.txt.

Statistics

The main statistics of ML-Images are summarized in the following table.

| # Train images | # Validation images | # Classes | # Trainable Classes | # Avg tags per image | # Avg images per class |
|---|---|---|---|---|---|
| 17,609,752 | 88,739 | 11,166 | 10,505 | 8.72 | 13,843 |

Note: A trainable class is a class that has more than 100 training images.
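
Statistics of this kind can be derived from per-image label lists, as sketched below. The function name `summarize` is hypothetical, and the average is computed as total tags divided by number of images, which is an assumed definition and is not confirmed to be how the figures in the table were produced.

```python
from collections import Counter


def summarize(label_lists, min_images=100):
    """label_lists: one list of class indices per training image."""
    images_per_class = Counter()
    for labels in label_lists:
        images_per_class.update(set(labels))  # count each image once per class
    total_tags = sum(images_per_class.values())
    return {
        "images": len(label_lists),
        "classes": len(images_per_class),
        # "over 100 train images" is read as strictly more than min_images
        "trainable_classes": sum(1 for n in images_per_class.values() if n > min_images),
        "avg_tags_per_image": total_tags / len(label_lists),
    }


# Toy data with a lowered threshold so that the effect is visible.
stats = summarize([[1, 2], [1], [1, 3]], min_images=2)
print(stats["classes"], stats["trainable_classes"])  # 3 1
```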


The number of images per class and the histogram of the number of annotations in the training set are shown in two figures.

[Figures: number of images per class; histogram of the number of annotations in the training set]

Train

Prepare the TFRecord File

Here we generate the tfrecords using the multithreading module. One should first split the file train_im_list_tiny.txt into multiple smaller files and save them into the sub-folder data/image_lists/.
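
The splitting step could be done with a few lines of Python, as in this sketch. The function name `split_list_file` and the `part_NNN.txt` file names are hypothetical; whether tfrecord.sh expects a particular naming scheme is not stated here, so adjust the names as needed.

```python
import os


def split_list_file(list_path, out_dir, num_parts):
    """Write the rows of list_path into num_parts files of near-equal size."""
    with open(list_path) as f:
        rows = [line for line in f if line.strip()]
    os.makedirs(out_dir, exist_ok=True)
    paths = []
    for part in range(num_parts):
        path = os.path.join(out_dir, "part_%03d.txt" % part)
        with open(path, "w") as f:
            f.writelines(rows[part::num_parts])  # round-robin assignment
        paths.append(path)
    return paths
```

For example, `split_list_file("train_im_list_tiny.txt", "image_lists", 4)` run from inside data/ would write four list files.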

cd data
./tfrecord.sh

Multiple tfrecords (named like x.tfrecords) will be saved to data/tfrecords/.

Pretrain on ML-Images

Before training, one should move the train and validation tfrecords to data/ml-images/train and data/ml-images/val, respectively. Then,

./example/train.sh

Note: Here we provide training code only for the single-node, single-GPU setting, while our actual training on ML-Images is based on an internal distributed training framework (not released yet). One could adapt the training code to a distributed setting following distributed TensorFlow.

Finetune on ImageNet

One should first download the ImageNet (ILSVRC2012) database and prepare the tfrecord files using tfrecord.sh. Then, one can finetune the ResNet-101 model on ImageNet as follows, starting from the checkpoint pre-trained on ML-Images.

./example/finetune.sh

Checkpoints

  • ckpt-resnet101-mlimages (link1, link2): pretrained on ML-Images
  • ckpt-resnet101-mlimages-imagenet (link1, link2): pretrained on ML-Images and finetuned on ImageNet (ILSVRC2012)

Please download the two checkpoints above and move them into the folder checkpoints/ if you want to extract features using them.

Single-Label Image Classification

Here we provide a demo for single-label image-classification, using the checkpoint ckpt-resnet101-mlimages-imagenet downloaded above.

./example/image_classification.sh

The prediction will be saved to label_pred.txt. If one wants to recognize other images, data/im_list_for_classification.txt should be modified to include the path of these images.

Feature Extraction

./example/extract_feature.sh

Results

The results of different ResNet-101 checkpoints on the validation set of ImageNet (ILSVRC2012) are summarized in the following table.

| Checkpoints | Train and finetune setting | Top-1 acc on Val 224 | Top-5 acc on Val 224 | Top-1 acc on Val 299 | Top-5 acc on Val 299 |
|---|---|---|---|---|---|
| MSRA ResNet-101 | train on ImageNet | 76.4 | 92.9 | -- | -- |
| Google ResNet-101 ckpt1 | train on ImageNet, 299 x 299 | -- | -- | 77.5 | 93.9 |
| Our ResNet-101 ckpt1 | train on ImageNet | 77.8 | 93.9 | 79.0 | 94.5 |
| Google ResNet-101 ckpt2 | Pretrain on JFT-300M, finetune on ImageNet, 299 x 299 | -- | -- | 79.2 | 94.7 |
| Our ResNet-101 ckpt2 | Pretrain on ML-Images, finetune on ImageNet | 78.8 | 94.5 | 79.5 | 94.9 |
| Our ResNet-101 ckpt3 | Pretrain on ML-Images, finetune on ImageNet 224 to 299 | 78.3 | 94.2 | 80.73 | 95.5 |
| Our ResNet-101 ckpt4 | Pretrain on ML-Images, finetune on ImageNet 299 x 299 | 75.8 | 92.7 | 79.6 | 94.6 |

Note:

  • If not specified, the image size in training/finetuning is 224 x 224.
  • "Finetune on ImageNet 224 to 299" means that the image size is 224 x 224 in the early epochs of finetuning, then 299 x 299 in the late epochs.
  • "Top-1 acc on Val 224" indicates the top-1 accuracy on 224 x 224 validation images.
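
For reference, top-1 and top-5 accuracy are instances of top-k accuracy, which can be computed as in the sketch below. The function name `top_k_accuracy` and the toy scores are hypothetical; this is a plain-Python illustration of the metric, not the evaluation code behind the table.

```python
def top_k_accuracy(scores, targets, k):
    """Percentage of samples whose true class is among the k highest scores."""
    hits = 0
    for row, target in zip(scores, targets):
        ranked = sorted(range(len(row)), key=lambda c: row[c], reverse=True)
        if target in ranked[:k]:
            hits += 1
    return 100.0 * hits / len(targets)


scores = [
    [0.1, 0.7, 0.2],   # highest score: class 1
    [0.5, 0.3, 0.2],   # highest score: class 0
    [0.2, 0.3, 0.5],   # highest score: class 2
    [0.6, 0.3, 0.1],   # highest score: class 0
]
targets = [1, 1, 2, 2]
print(top_k_accuracy(scores, targets, 1))  # 50.0
print(top_k_accuracy(scores, targets, 2))  # 75.0
```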

Copyright

The annotations of images are licensed by Tencent under the CC BY 4.0 license. The contents of this repository, including the code, documents and checkpoints, are released under a BSD 3-Clause license. Please refer to LICENSE for more details.

If there is any concern about the copyright of any image used in this project, please email us.

Citation

If any content of this project is utilized in your work (such as data, checkpoint, code, or the proposed loss or training algorithm), please cite the following manuscript.

@article{tencent-ml-images-2019,
  title={Tencent ML-Images: A Large-Scale Multi-Label Image Database for Visual Representation Learning},
  author={Wu, Baoyuan and Chen, Weidong and Fan, Yanbo and Zhang, Yong and Hou, Jinlong and Liu, Jie and Zhang, Tong},
  journal={IEEE Access},
  volume={7},
  year={2019}
}
