Facial Expression Recognition using Residual Masking Network
The code for my undergraduate thesis.
Inference:
- Install from pip
pip install rmn
# or build from source
git clone [email protected]:phamquiluan/ResidualMaskingNetwork.git
cd ResidualMaskingNetwork
pip install -e .
- Run demo in Python (with webcam available)
from rmn import RMN
m = RMN()
m.video_demo()
- Detect emotions in single images
import cv2

image = cv2.imread("some-image-path.png")
results = m.detect_emotion_for_single_frame(image)
print(results)
image = m.draw(image, results)
cv2.imwrite("output.png", image)
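The structure of results is not described above. As a minimal sketch, assuming each entry is a dict with emo_label and emo_proba keys (an assumption about the rmn output that you should verify by printing results on your installed version), the most confident detection could be picked as follows. The helper top_emotion is hypothetical and not part of rmn.

```python
def top_emotion(results, min_proba=0.0):
    """Return the label of the most confident detection, or None.

    Assumes each item is a dict with 'emo_label' and 'emo_proba' keys.
    This is an assumption about the rmn output, not something this README confirms.
    """
    best_label, best_proba = None, min_proba
    for face in results:
        proba = face.get("emo_proba", 0.0)
        # Keep the detection with the highest probability at or above the threshold.
        if proba >= best_proba:
            best_label, best_proba = face.get("emo_label"), proba
    return best_label
```

Calling top_emotion(results, min_proba=0.5) would then ignore faces whose top emotion is below 50% confidence.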
Table of Contents
- Recent Update
- Benchmarking on FER2013
- Benchmarking on ImageNet
- Installation
- Download datasets
- Training on FER2013
- Training on ImageNet
- Evaluation results
- Download dissertation and slide
Recent Update
- [07/03/2023] Re-structure, update Readme
- [05/05/2021] Release ver 2, add colab
- [27/02/2021] Add paper
- [14/01/2021] Packaging Project and publish rmn on PyPI
- [27/02/2020] Update Tensorboard visualizations and Overleaf source
- [22/02/2020] Test-time augmentation implementation.
- [21/02/2020] Imagenet training code and trained weights released.
- [21/02/2020] Imagenet evaluation results released.
- [10/01/2020] Checking demo stuff and training procedure works on another machine
- [09/01/2020] First time upload
Benchmarking on FER2013
We benchmark our code thoroughly on two datasets: FER2013 and VEMO. Below are the results and trained weights:
Model | Accuracy |
---|---|
VGG19 | 70.80 |
EfficientNet_b2b | 70.80 |
Googlenet | 71.97 |
Resnet34 | 72.42 |
Inception_v3 | 72.72 |
Bam_Resnet50 | 73.14 |
Densenet121 | 73.16 |
Resnet152 | 73.22 |
Cbam_Resnet50 | 73.39 |
ResMaskingNet | 74.14 |
ResMaskingNet + 6 | 76.82 |
Results on the VEMO dataset can be found in my thesis or slides (linked below).
Benchmarking on ImageNet
We also benchmark our model on the ImageNet dataset.
Model | Top-1 Accuracy | Top-5 Accuracy |
---|---|---|
Resnet34 | 72.59 | 90.92 |
CBAM Resnet34 | 73.77 | 91.72 |
ResidualMaskingNetwork | 74.16 | 91.91 |
Installation
- Install PyTorch by selecting your environment on the website and running the appropriate command.
- Clone this repository and install package prerequisites below.
- Then download the dataset by following the instructions below.
Datasets
- FER2013 Dataset (place it in saved/data/fer2013, e.g. saved/data/fer2013/train.csv)
- ImageNet 1K Dataset (ensure it can be loaded by torchvision.datasets.ImageNet)
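To sanity-check the FER2013 file after placing it, a small parser can help. The sketch below assumes train.csv follows the standard FER2013 layout: a header with emotion and pixels columns, where pixels holds 48 x 48 space-separated grayscale values. The function name parse_fer_rows is hypothetical; this is not the repository's own data loader.

```python
import csv


def parse_fer_rows(lines, size=48):
    """Yield (label, image) pairs from FER2013-style CSV lines.

    Assumes a header with 'emotion' and 'pixels' columns, where 'pixels'
    holds size * size space-separated grayscale values.
    """
    for row in csv.DictReader(lines):
        pixels = [int(value) for value in row["pixels"].split()]
        if len(pixels) != size * size:
            raise ValueError(f"expected {size * size} pixels, got {len(pixels)}")
        # Reshape the flat list into `size` rows of `size` pixels each.
        image = [pixels[r * size:(r + 1) * size] for r in range(size)]
        yield int(row["emotion"]), image
```

With a real file, pass the open file object, for example open("saved/data/fer2013/train.csv", newline=""), and iterate over the yielded pairs.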
Training on FER2013
- To train a network, specify the model name and other hyperparameters in a config file (located in configs/*), make sure the main file loads that config, then run the main file. For example:
python main_fer.py # Example for fer2013_config.json file
- The best checkpoint is selected by validation accuracy and saved in saved/checkpoints
- The TensorBoard training logs are located in saved/logs; to view them, run tensorboard --logdir saved/logs/
- By default, it trains the alexnet model. You can switch to another model by editing the configs/fer2013_config.json file (to resnet18, cbam_resnet50, or my network, resmasking_dropout1).
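Switching the model can also be scripted rather than edited by hand. The sketch below rewrites one top-level key of a JSON config. The key name arch is an assumption about the layout of fer2013_config.json, so check the actual file first. The helper with_model is hypothetical, and it refuses to add a key that is not already present so that a wrong guess fails loudly.

```python
import json


def with_model(config_text, model_name, key="arch"):
    """Return the JSON config text with `key` set to `model_name`.

    `key` defaults to "arch", which is an assumed name; a KeyError is raised
    if the config does not already contain it.
    """
    config = json.loads(config_text)
    if key not in config:
        raise KeyError(f"config has no {key!r} entry; check the key name")
    config[key] = model_name
    return json.dumps(config, indent=2)
```

To apply it, read configs/fer2013_config.json as text, pass it through with_model(text, "resmasking_dropout1"), and write the result back to the same path.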
Training on ImageNet
To train resnet34 on 4 V100 GPUs on a single machine:
python ./main_imagenet.py -a resnet34 --dist-url 'tcp://127.0.0.1:12345' --dist-backend 'nccl' --multiprocessing-distributed --world-size 1 --rank 0
Evaluation
For students who care about the font family of the confusion matrix and would like to typeset things in LaTeX, below is an example that generates a striking confusion matrix.
(Read this article for more information; you will run into bugs if you blindly run the code without reading it.)
python cm_cbam.py
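The internals of cm_cbam.py are not shown here. As a sketch of the data such a plot is typically built from, the following computes a confusion matrix from integer labels and normalizes each row to per-class rates. Both function names are hypothetical, and the code uses only the standard library.

```python
def confusion_matrix(y_true, y_pred, num_classes):
    """Count predictions: rows are true classes, columns are predicted classes."""
    matrix = [[0] * num_classes for _ in range(num_classes)]
    for true_label, pred_label in zip(y_true, y_pred):
        matrix[true_label][pred_label] += 1
    return matrix


def normalize_rows(matrix):
    """Convert counts to per-class rates; rows with no samples stay all zeros."""
    normalized = []
    for row in matrix:
        total = sum(row)
        normalized.append([count / total if total else 0.0 for count in row])
    return normalized
```

For FER2013, num_classes would be 7, one per emotion label.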
Ensemble method
I used an unweighted sum-average ensemble method to fuse 7 different models together. To reproduce the results, follow these steps:
- Download all the needed trained weights and place them in the ./saved/checkpoints/ directory. Download links can be found in the Benchmarking section.
- Edit the gen_results file and run it to generate offline results for each model.
- Run the gen_ensemble.py file to generate the accuracy for the example methods.
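The steps above can be illustrated with a minimal sketch of unweighted averaging, assuming each model's offline result is a list of per-sample class-probability rows. This is not the code in gen_ensemble.py; the function names and the input layout are assumptions made for illustration.

```python
def average_ensemble(model_probs):
    """Average class probabilities across models with equal weights.

    model_probs: one entry per model, each shaped [n_samples][n_classes].
    """
    if not model_probs:
        raise ValueError("need at least one model's probabilities")
    n_models = len(model_probs)
    averaged = []
    for sample_rows in zip(*model_probs):
        # sample_rows holds one probability row per model for the same sample.
        averaged.append([sum(column) / n_models for column in zip(*sample_rows)])
    return averaged


def predict(averaged):
    """Pick the class index with the highest averaged probability per sample."""
    return [max(range(len(row)), key=row.__getitem__) for row in averaged]


def accuracy(predictions, labels):
    """Fraction of predictions that match the labels."""
    return sum(p == t for p, t in zip(predictions, labels)) / len(labels)
```

Because every model gets the same weight, adding or removing a model only changes the list passed to average_ensemble.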
Dissertation and Slide
- Dissertation PDF (in Vietnamese)
- Dissertation Overleaf Source
- Presentation slide PDF (in English) with full appendix
- Presentation slide Overleaf Source
- Paper
Authors
Citation
Pham, Luan, The Huynh Vu, and Tuan Anh Tran. "Facial expression recognition using residual masking network." 2020 25th International Conference on Pattern Recognition (ICPR). IEEE, 2021.
@inproceedings{pham2021facial,
title={Facial expression recognition using residual masking network},
author={Pham, Luan and Vu, The Huynh and Tran, Tuan Anh},
booktitle={2020 25Th international conference on pattern recognition (ICPR)},
pages={4513--4519},
year={2021},
organization={IEEE}
}