  • Language
    C++
  • License
    MIT License

[ICCV2019 Oral] Photo-Realistic Facial Details Synthesis from Single Image

This is the code repo for Facial Details Synthesis From Single Input Image. [Paper] [Supplemental Material] [Video]

This repository consists of 5 individual parts: DFDN, emotionNet, landmarkDetector, proxyEstimator and faceRender.

  • DFDN estimates the displacement map; its network architecture is based on junyanz's pix2pix
  • For landmarkDetector and the FACS-based expression detector (you can choose between this and emotionNet), we use a simplified version of OpenFace
  • proxyEstimator generates the proxy mesh using an expression/emotion prior. It is modified from patrikhuber's fantastic work eos
  • faceRender is used for interactive rendering

We would like to thank each of the related projects for their great work.

Facial Details Synthesis

Anpei Chen, Zhang Chen, Guli Zhang, Ziheng Zhang, Kenny Mitchell, Jingyi Yu

We present a single-image 3D face synthesis technique that can handle challenging facial expressions while recovering fine geometric details. Our technique employs expression analysis for proxy face geometry generation and combines supervised and unsupervised learning for facial detail synthesis. On proxy generation, we conduct emotion prediction to determine a new expression-informed proxy. On detail synthesis, we present a Deep Facial Detail Net (DFDN) based on Conditional Generative Adversarial Net (CGAN) that employs both geometry and appearance loss functions. For geometry, we capture 366 high-quality 3D scans from 122 different subjects under 3 facial expressions. For appearance, we use additional 163K in-the-wild face images and apply image-based rendering to accommodate lighting variations. Comprehensive experiments demonstrate that our framework can produce high-quality 3D faces with realistic details under challenging facial expressions.

Features

  • Functionality
    • Proxy estimation with expression/emotion prior
    • Facial details prediction, e.g. wrinkles
    • Renderer for results (proxy mesh + normalMap/displacementMap)
  • Input: Single image or image folder
  • Output: Proxy mesh & texture, detailed displacementMap and normalMap
  • OS: Windows 10

Set up environment

  1. Install the Windows version of Anaconda (Python 3.7) and PyTorch
  2. [Optional] Install TensorFlow and Keras if you want to use the emotion prior (emotionNet)

Released version

  1. Download the released package.

    Release v0.1.0 [Google Drive, OneDrive]

  2. Download models and pre-trained weights.

    DFDN checkpoints [Google Drive, OneDrive] unzip to ./DFDN/checkpoints

    landmark models [Google Drive, OneDrive] unzip to ./landmarkDetector

    [Optional] emotionNet checkpoints [Google Drive, OneDrive] unzip to ./emotionNet/checkpoints

  3. Install BFM2017

    • Install eos-py by pip install --force-reinstall eos-py==0.16.1

    • Download BFM2017 and copy model2017-1_bfm_nomouth.h5 to ./proxyEstimator/bfm2017/

    • Run python convert-bfm2017-to-eos.py to generate bfm2017-1_bfm_nomouth.bin in ./proxyEstimator/bfm2017/ folder

  4. Have fun!
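
Before running anything, it can help to confirm that the downloads above landed in the right places. The following is a minimal sketch, not part of the released package: the helper name missing_assets and the EXPECTED table are hypothetical, and the paths are simply the ones named in steps 2 and 3. It only checks that each path exists, since the file names inside the archives are not listed here.

```python
from pathlib import Path

# (relative path, optional?) pairs copied from the setup steps above.
# The table itself is an illustrative assumption, not a file shipped with the repo.
EXPECTED = [
    ("DFDN/checkpoints", False),
    ("landmarkDetector", False),
    ("proxyEstimator/bfm2017/bfm2017-1_bfm_nomouth.bin", False),
    ("emotionNet/checkpoints", True),  # only needed for the emotion prior
]


def missing_assets(root, include_optional=False):
    """Return the expected relative paths that do not exist under root."""
    root = Path(root)
    missing = []
    for rel, optional in EXPECTED:
        if optional and not include_optional:
            continue
        if not (root / rel).exists():
            missing.append(rel)
    return missing


if __name__ == "__main__":
    # Run from the root of the unpacked release package.
    for rel in missing_assets("."):
        print("missing:", rel)
```

Pass include_optional=True if you plan to use --emotion 1, so that the emotionNet checkpoints are checked as well.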

Usage

  • For proxy estimation,

    python proxyPredictor.py -i path/to/input/image -o path/to/output/folder [--FAC 1][--emotion 1]
    
    • For batch processing, you can set -i to an image folder.

    • For prior features, you can optionally choose one of two priors: the FACS-based expression prior (--FAC 1) or the emotion prior (--emotion 1).

    example: python proxyPredictor.py -i ./samples/proxy -o ./results

  • For facial details estimation,

    python facialDetails.py -i path/to/input/image -o path/to/output/folder
    

    example:

    python facialDetails.py -i ./samples/details/019615.jpg -o ./results

    python facialDetails.py -i ./samples/details -o ./results

  • Note: we strongly suggest cropping the input image to a square.
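
The commands and the square-crop note above can be sketched as a few small helpers. This is an illustration under assumptions, not code from the repository: the function names are hypothetical, and a centered crop is just one choice (the text does not say where to crop, so check that the face stays inside the box). The command builders only use the flags shown above.

```python
import sys


def center_square_box(width, height):
    """Return (left, top, right, bottom) of the largest centered square."""
    side = min(width, height)
    left = (width - side) // 2
    top = (height - side) // 2
    return (left, top, left + side, top + side)


def proxy_command(input_path, output_dir, prior=None):
    """Build the proxyPredictor.py call; prior is None, 'FAC' or 'emotion'."""
    if prior not in (None, "FAC", "emotion"):
        raise ValueError("prior must be None, 'FAC' or 'emotion'")
    cmd = [sys.executable, "proxyPredictor.py",
           "-i", str(input_path), "-o", str(output_dir)]
    if prior is not None:
        # Produces "--FAC 1" or "--emotion 1", as in the usage line above.
        cmd += ["--" + prior, "1"]
    return cmd


def details_command(input_path, output_dir):
    """Build the facialDetails.py call for one image or an image folder."""
    return [sys.executable, "facialDetails.py",
            "-i", str(input_path), "-o", str(output_dir)]


if __name__ == "__main__":
    print(center_square_box(640, 480))
    print(" ".join(proxy_command("./samples/proxy", "./results", prior="FAC")))
    print(" ".join(details_command("./samples/details", "./results")))
```

The box tuple has the layout that Pillow's Image.crop expects, if Pillow is installed, and each command list can be passed to subprocess.run(cmd, check=True) from the root of the package.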

Compiling

We suggest you directly download the released package for convenience. If you are interested in compiling the source code, please go through the following guidelines.

  1. First, clone the source code,

    git clone https://github.com/apchenstu/Facial_Details_Synthesis.git --recursive

  2. cd to the root of each individual part, then compile it,

    landmarkDetector

    • Run the download_libraries.ps1 and download_models.ps1 scripts with PowerShell.

    • Open OpenFace.sln using Visual Studio and compile the code.

      After compiling, the executable will be located in /x64/Release/FaceLandmarkImg.exe

    textureRender

    • Install with

       mkdir build && cd build
       cmake -A X64 -D CMAKE_PREFIX_PATH=../thirds ../src
      
    • Open textureRender.sln using Visual Studio and compile the code.

      After compiling, the executable will be located in Release/textureRender.exe

    proxyEstimator

    • Install vcpkg

    • Install the required packages from the vcpkg folder: ./vcpkg install opencv boost --triplet x64-windows

    • Install with,

      mkdir build && cd build
      cmake .. -A X64 -DCMAKE_TOOLCHAIN_FILE=[vcpkg root]\scripts\buildsystems\vcpkg.cmake
      
    • Open eos.sln using Visual Studio and compile the code.

      After compiling, the executable will be located in Release/eso.exe

      For more details, please refer to this repo.

    faceRender

    • Install with

       mkdir build && cd build
       cmake -A X64 -D CMAKE_PREFIX_PATH=../thirds ../src
      
    • Open hmrenderer.sln using Visual Studio and compile the code.

      After compiling, the executable will be located in build\Release

      Note: The visualizer currently only supports mesh + normalMap, but will also support displacementMap in the near future.

    After compiling, download the DFDN checkpoints and unzip them to ./DFDN/checkpoints. The code is then ready to use.

Others

On the way .....

Q & A

  1. Why is the proxy result different from what is shown in the paper?

    The released version uses a lower-resolution input and a different expression dictionary, which are more robust in the general case. Please try this if you want to obtain results similar to those in the paper.

Citation

If you find this code useful to your research, please consider citing:

@inproceedings{chen2019photo,
  title={Photo-Realistic Facial Details Synthesis from Single Image},
  author={Chen, Anpei and Chen, Zhang and Zhang, Guli and Zhang, Ziheng and Mitchell, Kenny and Yu, Jingyi},
  booktitle={Proceedings of the IEEE International Conference on Computer Vision},
  pages={9429--9439},
  year={2019}
}