• Stars: 241
  • Rank: 167,643 (Top 4%)
  • Language: Python
  • License: Other
  • Created: over 2 years ago
  • Updated: 4 months ago

Repository Details

Official PyTorch implementation of SPECTRE: Visual Speech-Aware Perceptual 3D Facial Expression Reconstruction from Videos

Paper · Project WebPage · YouTube Video

Our method performs visual-speech aware 3D reconstruction so that speech perception from the original footage is preserved in the reconstructed talking head. On the left we include the word/phrase being said for each example.

This is the official PyTorch implementation of the paper:

Visual Speech-Aware Perceptual 3D Facial Expression Reconstruction from Videos
Panagiotis P. Filntisis, George Retsinas, Foivos Paraperas-Papantoniou, Athanasios Katsamanis, Anastasios Roussos, and Petros Maragos
arXiv 2022

Installation

Clone the repo and its submodules:

git clone --recurse-submodules -j4 https://github.com/filby89/spectre
cd spectre

You need a working installation of PyTorch (Python 3.6 or higher) and PyTorch3D. You can use the following commands to create a working installation:

conda create -n "spectre" python=3.8
conda install -c pytorch pytorch=1.11.0 torchvision torchaudio # you might need to select cudatoolkit version here by adding e.g. cudatoolkit=11.3
conda install -c conda-forge -c fvcore fvcore iopath 
conda install pytorch3d -c pytorch3d
pip install -r requirements.txt # install the rest of the requirements

Installing a working PyTorch3D setup alongside PyTorch can be a bit tricky. For development we used PyTorch3D 0.6.1 with PyTorch 1.10.0; PyTorch3D 0.6.2 with PyTorch 1.11.0 is also compatible.
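A quick way to confirm you ended up with one of the pairings above is a small version check. This helper is our own illustration, not part of the repo; the "known good" set only encodes the two combinations mentioned here.

```python
def versions_compatible(torch_version: str, pytorch3d_version: str) -> bool:
    """Return True for the PyTorch / PyTorch3D pairings mentioned above."""
    # Strip local build suffixes such as "1.11.0+cu113" -> "1.11.0".
    known_good = {("1.10.0", "0.6.1"), ("1.11.0", "0.6.2")}
    return (torch_version.split("+")[0], pytorch3d_version) in known_good
```

Inside the `spectre` environment you could call it as `versions_compatible(torch.__version__, pytorch3d.__version__)`.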

Install the face_alignment and face_detection packages:

cd external/face_alignment
pip install -e .
cd ../face_detection
git lfs pull
pip install -e .
cd ../..

You may need to install git-lfs to run the above commands:

curl -s https://packagecloud.io/install/repositories/github/git-lfs/script.deb.sh | sudo bash
sudo apt-get install git-lfs

Download the FLAME model and the pretrained SPECTRE model:

pip install gdown
bash quick_install.sh

Demo

Samples are included in the samples folder. You can run the demo with:

python demo.py --input samples/LRS3/0Fi83BHQsMA_00002.mp4 --audio

The --audio flag extracts the audio from the input video and adds it to the output shape video for visualization purposes (ffmpeg is required for video creation).
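Conceptually, adding the audio back amounts to muxing the original audio track into the rendered (silent) shape video with ffmpeg. The sketch below is our own illustration of that step, not the actual demo.py internals; the function name and argument names are ours.

```python
def mux_audio_cmd(shape_video: str, source_video: str, out_path: str) -> list:
    """Build an ffmpeg command that keeps the rendered video stream and
    takes the audio track from the original input footage."""
    return [
        "ffmpeg", "-y",
        "-i", shape_video,   # rendered shape video (no audio)
        "-i", source_video,  # original input video (audio source)
        "-map", "0:v:0",     # video stream from the first input
        "-map", "1:a:0",     # audio stream from the second input
        "-c:v", "copy",      # do not re-encode the rendered frames
        out_path,
    ]
```

With ffmpeg on your PATH you could execute it via `subprocess.run(mux_audio_cmd(...), check=True)`.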

Training and Testing

In order to train the model you need to download the trainval and test sets of the LRS3 dataset. After downloading the dataset, run the following command to extract frames and audio from the videos (the audio is not needed for training, but it is useful for visualizing the results):

python utils/extract_frames_and_audio.py --dataset_path ./data/LRS3
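The extraction script walks the dataset directory; assuming the usual LRS3 layout of `<split>/<speaker>/<clip>.mp4` under the dataset root, collecting the clips to process might look like the following (a hypothetical helper of ours, not the repo's script):

```python
from pathlib import Path

def list_lrs3_clips(dataset_path: str):
    """Collect LRS3 clips, assuming <split>/<speaker>/<clip>.mp4 layout."""
    return sorted(Path(dataset_path).glob("*/*/*.mp4"))
```

Each returned path can then be handed to the frame/audio extraction step.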

After downloading and preprocessing the dataset, download the rest needed assets:

bash get_training_data.sh

This command downloads the original DECA pretrained model, the ResNet50 emotion recognition model provided by EMOCA, and the pretrained lipreading model and detected landmarks for the LRS3 videos provided by Visual_Speech_Recognition_for_Multiple_Languages.

Finally, you need to create a texture model using the BFM_to_FLAME repository. Due to licensing restrictions we are not allowed to share it.

Now, you can run the following command to train the model:

python main.py --output_dir logs --landmark 50 --relative_landmark 25 --lipread 2 --expression 0.5 --epochs 6 --LRS3_path data/LRS3 --LRS3_landmarks_path data/LRS3_landmarks
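The numeric flags above (--landmark 50, --relative_landmark 25, --lipread 2, --expression 0.5) set the relative weights of the training loss terms. A weighted-sum sketch of how such weights plausibly combine the losses is shown below; the names and the aggregation function are ours, see main.py for the actual implementation.

```python
def total_loss(losses: dict, weights: dict) -> float:
    """Weighted sum of named loss terms; unweighted terms contribute 0."""
    return sum(weights.get(name, 0.0) * value for name, value in losses.items())

# Weights matching the command-line flags above.
WEIGHTS = {"landmark": 50.0, "relative_landmark": 25.0, "lipread": 2.0, "expression": 0.5}
```

Raising --lipread, for instance, would push the optimizer harder toward mouth movements that a lipreading network can still decode.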

and then test it on the LRS3 dataset test set:

python main.py --test --output_dir logs --model_path logs/model.tar --LRS3_path data/LRS3 --LRS3_landmarks_path data/LRS3_landmarks

and run lipreading with AV-hubert:

python utils/run_av_hubert.py --videos "logs/test_videos_000000/*_mouth.avi" --LRS3_path data/LRS3
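Lipreading evaluation on the reconstructed mouth videos is typically scored with word error rate (WER). For reference, a minimal Levenshtein-based WER implementation looks like this (a generic helper of ours, not the repo's scoring code):

```python
def wer(ref: str, hyp: str) -> float:
    """Word error rate: word-level edit distance divided by reference length."""
    r, h = ref.split(), hyp.split()
    # d[i][j] = edit distance between the first i reference words
    # and the first j hypothesis words.
    d = [[0] * (len(h) + 1) for _ in range(len(r) + 1)]
    for i in range(len(r) + 1):
        d[i][0] = i
    for j in range(len(h) + 1):
        d[0][j] = j
    for i in range(1, len(r) + 1):
        for j in range(1, len(h) + 1):
            cost = 0 if r[i - 1] == h[j - 1] else 1
            d[i][j] = min(d[i - 1][j] + 1,          # deletion
                          d[i][j - 1] + 1,          # insertion
                          d[i - 1][j - 1] + cost)   # substitution
    return d[len(r)][len(h)] / max(len(r), 1)
```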

Acknowledgements

This repo has been heavily based on the original implementation of DECA. We also acknowledge the following repositories, from which we have benefited greatly:

Citation

If your research benefits from this repository, consider citing the following:

@misc{filntisis2022visual,
  title = {Visual Speech-Aware Perceptual 3D Facial Expression Reconstruction from Videos},
  author = {Filntisis, Panagiotis P. and Retsinas, George and Paraperas-Papantoniou, Foivos and Katsamanis, Athanasios and Roussos, Anastasios and Maragos, Petros},
  publisher = {arXiv},
  year = {2022},
}
