A diverse benchmark database for multi-paradigm facial beauty prediction

SCUT-FBP5500-Database-Release

A diverse benchmark database (size: 172 MB) for multi-paradigm facial beauty prediction has been released by the Human Computer Intelligent Interaction Lab of South China University of Technology. The database can be downloaded through the following links:

1 Description

The SCUT-FBP5500 dataset contains 5500 frontal faces with diverse properties (male/female, Asian/Caucasian, ages) and diverse labels (facial landmarks, beauty scores on a scale of 1 to 5, beauty score distributions). This supports different computational models with different facial beauty prediction paradigms, such as appearance-based or shape-based classification, regression, and ranking models for male/female and Asian/Caucasian faces.

2 Database Construction

The SCUT-FBP5500 dataset can be divided into four subsets by race and gender: 2000 Asian females (AF), 2000 Asian males (AM), 750 Caucasian females (CF), and 750 Caucasian males (CM). Most of the images were collected from the Internet; some of the Asian faces came from DataTang, GuangZhouXiangSu, and our laboratory, and some of the Caucasian faces came from the 10k US Adult Faces database.

All images are labeled with beauty scores in the range [1, 5] by a total of 60 volunteers, and 86 facial landmarks are located on the significant facial components of each image. The facial landmarks are saved in 'pts' format, which can be converted to 'txt' format by running pts2txt.py. We developed several web-based GUI systems to collect the facial beauty scores and the facial landmark locations, respectively.

Training/Testing Set Split

We use two experimental settings to evaluate facial beauty prediction methods on the SCUT-FBP5500 benchmark:

  1. 5-fold cross-validation. In each fold, 80% of the samples (4400 images) are used for training and the remaining 20% (1100 images) are used for testing.
  2. A 60%/40% split. 60% of the samples (3300 images) are used for training and the remaining 40% (2200 images) are used for testing. We have provided the training and testing files in this link.
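The provided training and testing files define the official splits and should be used for any comparison with the benchmark. The sketch below only illustrates how the two settings partition 5500 samples into the sizes stated above. The random shuffling, the seed, the function names, and the placeholder file names are assumptions made for illustration, not the procedure used to build the released files.

```python
import random

def five_fold_splits(items, seed=0):
    """Yield (train, test) lists for 5-fold cross-validation."""
    shuffled = list(items)
    random.Random(seed).shuffle(shuffled)
    # Five disjoint folds; each fold serves as the test set exactly once.
    folds = [shuffled[i::5] for i in range(5)]
    for k in range(5):
        test = folds[k]
        train = [x for j, fold in enumerate(folds) if j != k for x in fold]
        yield train, test

def holdout_split(items, train_fraction=0.6, seed=0):
    """Return (train, test) lists for a single train/test split."""
    shuffled = list(items)
    random.Random(seed).shuffle(shuffled)
    cut = round(len(shuffled) * train_fraction)
    return shuffled[:cut], shuffled[cut:]

# Hypothetical file names standing in for the 5500 images.
names = [f"img_{i:04d}.jpg" for i in range(5500)]

for train, test in five_fold_splits(names):
    print(len(train), len(test))

train, test = holdout_split(names)
print(len(train), len(test))
```

This sketch does not stratify by subset (AF, AM, CF, CM); whether the released files do so is not stated here.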

3 Training Tutorials and Models

We trained three different CNN models (AlexNet, ResNet-18, ResNeXt-50) on the SCUT-FBP5500 dataset for facial beauty prediction using an L2-norm distance loss. Each raw RGB image is resized to 256×256; a random 227×227 crop of the image is then fed into AlexNet, while a random 224×224 crop is fed into ResNet and ResNeXt. The model parameters are initialized from CNN models pretrained on ImageNet and updated by mini-batch Stochastic Gradient Descent (SGD), where the learning rate is initialized to 0.001 and decreased by a factor of 10 every 5000 iterations. We set the batch size to 16, the momentum coefficient to 0.9, the maximum number of iterations to 20000, and the weight decay coefficient to 5e-4 for AlexNet and 1e-4 for ResNet and ResNeXt.
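The step learning-rate schedule and the random-crop geometry described above can be sketched in plain Python. The numeric values come from the paragraph above; the function names and the way the crop box is sampled (uniformly over all valid positions) are assumptions for illustration, not the released training code.

```python
import random

# Hyperparameters as stated in the text above.
BASE_LR = 0.001
LR_DECAY_FACTOR = 0.1   # "decreased by a factor of 10"
LR_STEP = 5000          # iterations between decays
MAX_ITERS = 20000
RESIZE = 256            # images are resized to RESIZE x RESIZE

def learning_rate(iteration):
    """Step schedule: divide the rate by 10 after every LR_STEP iterations."""
    return BASE_LR * LR_DECAY_FACTOR ** (iteration // LR_STEP)

def random_crop_box(crop_size, rng=random):
    """Return (left, top, right, bottom) of a random crop inside the resized image."""
    left = rng.randint(0, RESIZE - crop_size)
    top = rng.randint(0, RESIZE - crop_size)
    return left, top, left + crop_size, top + crop_size

for it in (0, 4999, 5000, 10000, 15000, MAX_ITERS - 1):
    print(it, learning_rate(it))

print(random_crop_box(227))  # crop size used for AlexNet
print(random_crop_box(224))  # crop size used for ResNet and ResNeXt
```

In PyTorch, a comparable schedule is available as `torch.optim.lr_scheduler.StepLR` with `step_size=5000` and `gamma=0.1`, provided the scheduler is stepped once per iteration; whether the authors' code uses it is not stated here.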

All experiments were implemented separately on two platforms, Caffe and PyTorch. We release the feed-forward code and the CNN models trained on the data in 'train_1.txt'. Please refer to the 'trained_models_for_caffe' and 'trained_models_for_pytorch' folders for more details.

Trained Models for Caffe

The trained models for Caffe (Size = 322MB) can be downloaded through the following links:

Requirements:

  • Python 2.7
  • Caffe
  • Numpy
  • Matplotlib
  • Scikit-image

Trained Models for PyTorch

The trained models for PyTorch (Size = 101MB) can be downloaded through the following link:

  • Download link:

https://pan.baidu.com/s/1OhyJsCMfAdeo8kIZd29yAw (PASSWORD: ateu)

Requirements:

  • Python 2.7
  • Torch 1.0.1
  • Numpy
  • Pillow

4 Benchmark Evaluation

We use AlexNet, ResNet-18, and ResNeXt-50 as the benchmarks for the SCUT-FBP5500 dataset and evaluate them with several metrics: Pearson correlation (PC), mean absolute error (MAE), and root mean square error (RMSE). The evaluation results are shown below. Please refer to our paper for more details.

[Images: benchmark evaluation results]
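For readers who want to compute the same kind of metrics on their own predictions, here is a minimal sketch of PC, MAE, and RMSE in plain Python. It assumes MAE denotes the mean absolute error, as the abbreviation is conventionally used. The function names and the sample scores are made up for illustration; they are not the authors' evaluation code or results.

```python
import math

def pearson_correlation(y_true, y_pred):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(y_true)
    mean_t = sum(y_true) / n
    mean_p = sum(y_pred) / n
    cov = sum((t - mean_t) * (p - mean_p) for t, p in zip(y_true, y_pred))
    var_t = sum((t - mean_t) ** 2 for t in y_true)
    var_p = sum((p - mean_p) ** 2 for p in y_pred)
    return cov / math.sqrt(var_t * var_p)

def mean_absolute_error(y_true, y_pred):
    """Average of the absolute differences."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)

def root_mean_square_error(y_true, y_pred):
    """Square root of the average squared difference."""
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true))

# Made-up scores in [1, 5] for illustration only; not results from the dataset.
truth = [2.0, 3.5, 4.0, 1.5]
preds = [2.5, 3.0, 4.5, 1.0]

print("PC:  ", pearson_correlation(truth, preds))
print("MAE: ", mean_absolute_error(truth, preds))
print("RMSE:", root_mean_square_error(truth, preds))
```

A higher PC and lower MAE and RMSE indicate predictions closer to the human-rated scores.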

5 Citation and Contact

Please consider citing our paper if you use our database:

@article{liang2017SCUT,
  title     = {SCUT-FBP5500: A Diverse Benchmark Dataset for Multi-Paradigm Facial Beauty Prediction},
  author    = {Liang, Lingyu and Lin, Luojun and Jin, Lianwen and Xie, Duorui and Li, Mengru},
  journal   = {ICPR},
  year      = {2018}
}

Note: The SCUT-FBP5500 database may be used only for non-commercial research purposes.

For any questions about this database, please contact the authors by email at [email protected] and [email protected].

Disclaimer

This AI algorithm is intended purely for academic research. The dataset and code are for academic research use only. We are not responsible for the objectivity or accuracy of the proposed model and algorithm.
