Face Recognition for NVIDIA Jetson (Nano) using TensorRT
Face recognition with the Google FaceNet architecture and the retrained model by David Sandberg (github.com/davidsandberg/facenet), using TensorRT and OpenCV.
This project is based on the implementation of the l2norm helper functions, which are needed in the output layer of the FaceNet model. Link to the repo: github.com/r7vme/tensorrt_l2norm_helper.
Moreover, this project uses an adapted version of PKUZHOU's implementation of mtCNN for face detection. More info below.
Hardware
- NVIDIA Jetson Nano
- Raspberry Pi v2 camera
If you want to use a USB camera instead of the Raspberry Pi camera, set the boolean isCSICam to false in main.cpp.
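The difference between the two camera types can be sketched as follows. This is a Python illustration only, not the code in main.cpp (which is C++): the function name camera_source, the resolution, the frame rate and the USB index are assumptions chosen for the example. A CSI camera on a Jetson is typically read through a GStreamer pipeline built around nvarguscamerasrc, while a USB camera is addressed by its device index.

```python
def camera_source(is_csi_cam, width=1280, height=720, fps=30, usb_index=0):
    """Return the value that would be handed to cv2.VideoCapture.

    Hypothetical helper: resolution, frame rate and index are example values.
    """
    if not is_csi_cam:
        # USB cameras are addressed by their device index (/dev/video<N>).
        return usb_index
    # CSI cameras are read through the nvarguscamerasrc GStreamer element
    # and converted to BGR frames for OpenCV.
    return (
        "nvarguscamerasrc ! "
        "video/x-raw(memory:NVMM), width=(int){w}, height=(int){h}, "
        "format=(string)NV12, framerate=(fraction){fps}/1 ! "
        "nvvidconv ! video/x-raw, format=(string)BGRx ! "
        "videoconvert ! video/x-raw, format=(string)BGR ! appsink"
    ).format(w=width, h=height, fps=fps)


print(camera_source(is_csi_cam=True))
print(camera_source(is_csi_cam=False))
```

With OpenCV's Python bindings, the CSI string would be opened with cv2.VideoCapture(camera_source(True), cv2.CAP_GSTREAMER) and the USB index with cv2.VideoCapture(camera_source(False)).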
Dependencies
- CUDA 10.2 + cuDNN 8.0
- TensorRT 7.x
- OpenCV 4.1.1
- TensorFlow r1.14 (for Python to convert model from .pb to .uff)
Update
The master branch now uses JetPack 4.4, so the dependencies have changed slightly and TensorFlow is no longer preinstalled. Installing it is an extra step that takes a few minutes more than before.
If you would like to use an older version of JetPack, there is a tag jp4.2.2 that links to the older implementation.
Installation
1. Install CUDA, cuDNN, TensorRT, and TensorFlow for Python
You can check the NVIDIA website for help; the installation procedures are very well documented.
If you are using an NVIDIA Jetson (Nano, TX1/2, Xavier) with JetPack 4.4, most of the needed packages should already be installed, provided the Jetson was correctly flashed using SDK Manager or the SD card image. You will only need to install cmake, openblas and TensorFlow:
sudo apt install cmake libopenblas-dev
2. Install TensorFlow
The following steps install TensorFlow for JetPack 4.4. They were copied from the official NVIDIA documentation and assume you do not need a virtual environment; if you do, please refer to that documentation. If you are not installing this on a Jetson, please refer to the official TensorFlow documentation.
# Install system packages required by TensorFlow:
sudo apt update
sudo apt install libhdf5-serial-dev hdf5-tools libhdf5-dev zlib1g-dev zip libjpeg8-dev liblapack-dev libblas-dev gfortran
# Install and upgrade pip3
sudo apt install python3-pip
sudo pip3 install -U pip testresources setuptools
# Install the Python package dependencies
sudo pip3 install -U numpy==1.16.1 future==0.18.2 mock==3.0.5 h5py==2.10.0 keras_preprocessing==1.1.1 keras_applications==1.0.8 gast==0.2.2 futures protobuf pybind11
# Install TensorFlow using the pip3 command. This command will install the latest version of TensorFlow compatible with JetPack 4.4.
sudo pip3 install --pre --extra-index-url https://developer.download.nvidia.com/compute/redist/jp/v44 'tensorflow<2'
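After the installation it can be worth confirming that Python picks up a 1.x release, since the pip command above restricts TensorFlow to versions below 2. A minimal sketch; the helper names is_tf1 and installed_tf_version are hypothetical and not part of this project:

```python
def is_tf1(version):
    """Return True if a version string such as '1.15.2' is a 1.x release."""
    return version.split(".")[0] == "1"


def installed_tf_version():
    """Return the installed TensorFlow version, or None if it cannot be imported."""
    try:
        import tensorflow as tf
    except ImportError:
        return None
    return tf.__version__
```

Calling installed_tf_version() in a python3 shell should return a version string, and passing it to is_tf1 should give True before you continue with the model conversion.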
3. Prune and freeze the TensorFlow model or download an already frozen model
The original model has two inputs: an input tensor containing one or more faces, and a phase train tensor that tells all batch normalisation layers that the model is not in training mode. Batch normalisation uses a switch layer to decide whether the model is currently being trained or only used for inference. TensorRT cannot process this switch layer, which is why it needs to be removed. Apparently this can be done using freeze_graph from TensorFlow, but here is a link to a model from which the phase train tensor has already been removed: github.com/apollo-time/facenet/raw/master/model/resnet/facenet.pb
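To illustrate why the phase train switch exists at all, the two branches of a batch-normalisation layer can be sketched in pure Python. This is not TensorFlow's implementation; the function name, the scalar inputs and the default eps are assumptions made for the example.

```python
import math


def batch_norm(batch, phase_train, moving_mean, moving_var,
               gamma=1.0, beta=0.0, eps=1e-3):
    """Normalise a list of scalars the way a batch-norm layer would."""
    if phase_train:
        # Training branch: statistics are computed from the current batch.
        mean = sum(batch) / len(batch)
        var = sum((x - mean) ** 2 for x in batch) / len(batch)
    else:
        # Inference branch: fixed statistics collected during training.
        mean, var = moving_mean, moving_var
    return [gamma * (x - mean) / math.sqrt(var + eps) + beta for x in batch]


# With phase_train fixed to False the layer is a constant affine map,
# so the switch between the two branches is no longer needed.
print(batch_norm([0.5, 1.5], phase_train=False, moving_mean=1.0, moving_var=4.0))
```

Once the model is only used for inference, the training branch is dead code, which is why the switch can be pruned from the graph without changing the outputs.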
4. Convert frozen protobuf (.pb) model to UFF
Use the convert-to-uff tool, which is installed with the TensorFlow installation, to convert the *.pb model to *.uff. The script replaces unsupported layers with custom layers implemented by github.com/r7vme/tensorrt_l2norm_helper. Please check the script for the user-defined values and update them if needed. Do not worry if there are a few warnings about the TRT_L2NORM_HELPER plugin.
cd path/to/project
python3 step01_pb_to_uff.py
You should now have a facenet.uff file in the facenetModels folder which will be used as the input model to TensorRT.
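The operation that the l2norm helper supplies at the output of FaceNet is an L2 normalisation of the embedding vector. The plugin itself is not written in Python; the sketch below only shows the arithmetic, and the eps value is an assumption used to avoid dividing by zero.

```python
import math


def l2_normalize(embedding, eps=1e-10):
    """Scale a vector to unit Euclidean length."""
    squared_sum = sum(v * v for v in embedding)
    # Clamp the squared norm so an all-zero vector does not divide by zero.
    norm = math.sqrt(max(squared_sum, eps))
    return [v / norm for v in embedding]


print(l2_normalize([3.0, 4.0]))
```

After normalisation every embedding lies on the unit hypersphere, so distances between embeddings of different faces are directly comparable.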
5. Get mtCNN models
This repo uses PKUZHOU's implementation of the multi-task Cascaded Convolutional Neural Network (mtCNN) for face detection. The original implementation was adapted to return the bounding boxes so that they can be used as input to my FaceNet TensorRT implementation. You will need all models from that repo in the mtCNNModels folder, so please run the following to download them:
# go to one above project,
cd path/to/project/..
# clone PKUZHOU's repo,
git clone https://github.com/PKUZHOU/MTCNN_FaceDetection_TensorRT
# and move models into mtCNNModels folder
mv MTCNN_FaceDetection_TensorRT/det* path/to/project/mtCNNModels
After doing so you should have the following files in your mtCNNModels folder:
- det1_relu.caffemodel
- det1_relu.prototxt
- det2_relu.caffemodel
- det2_relu.prototxt
- det3_relu.caffemodel
- det3_relu.prototxt
- README.md
Done! You are ready to build the project.
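Before building, the file list above can be checked with a few lines of Python. This is a small convenience sketch, not part of the project; the helper name missing_files is hypothetical, and the script assumes it is run from the project root.

```python
import os

# Model files listed above (README.md is not needed for inference).
EXPECTED_MTCNN_FILES = [
    "det1_relu.caffemodel", "det1_relu.prototxt",
    "det2_relu.caffemodel", "det2_relu.prototxt",
    "det3_relu.caffemodel", "det3_relu.prototxt",
]


def missing_files(folder, expected=EXPECTED_MTCNN_FILES):
    """Return the expected file names that are not present in folder."""
    return [name for name in expected
            if not os.path.isfile(os.path.join(folder, name))]


if __name__ == "__main__":
    missing = missing_files("mtCNNModels")
    print("missing:", missing if missing else "none")
```

An empty result means all six model files are in place.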
6. Build the project
mkdir build && cd build
cmake -DCMAKE_BUILD_TYPE=Release ..
make -j$(nproc)
If you are not building on a Jetson platform, set the paths to your CUDA and TensorRT installations using -DCUDA_TOOLKIT_ROOTDIR=path/to/cuda and -DTENSORRT_ROOT=path/to/tensorRT.
NOTE
.uff and .engine files are GPU specific, so if you want to run this project on a different GPU or on another machine, always start over at step 3 above.
Usage
Put images of people in the imgs folder. Please only use images that contain one face.
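FaceNet embeddings are designed so that faces of the same person lie close together in Euclidean distance. Recognition against the reference images can therefore be sketched as a nearest-neighbour search. The sketch below is an illustration under assumptions, not the project's actual matching code: the names, the threshold of 1.0 and the two-dimensional toy vectors are all made up, and real embeddings have far more dimensions.

```python
import math


def euclidean(a, b):
    """Euclidean distance between two equally long vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def identify(embedding, known_faces, threshold=1.0):
    """Return the closest known name, or None if nothing is within threshold."""
    best_name, best_dist = None, threshold
    for name, reference in known_faces.items():
        dist = euclidean(embedding, reference)
        if dist < best_dist:
            best_name, best_dist = name, dist
    return best_name


# Toy reference embeddings, one per person (hypothetical values).
known = {"alice": [1.0, 0.0], "bob": [0.0, 1.0]}
print(identify([0.9, 0.1], known))
```

Using one face per reference image keeps this mapping from name to embedding unambiguous.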
NEW FEATURE: You can now add faces while the algorithm is running. When you see the OpenCV GUI, press "N" on your keyboard to add a new face. The camera input will stop until you have opened your terminal and entered the name of the person you want to add.
./mtcnn_facenet_cpp_tensorRT
Press "Q" to quit and to show the stats (fps).
NOTE: This step might take a while the first time it is run, because TensorRT parses the .uff model and serializes it to a runtime engine (.engine file).
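The first-run delay follows a common build-once, reuse-later pattern. A minimal Python sketch of that pattern, assuming a hypothetical build_fn that stands in for the slow TensorRT parse-and-optimise step (the project itself does this in C++, and its exact logic may differ):

```python
import os


def load_or_build_engine(engine_path, build_fn):
    """Return serialized engine bytes, building them only when no cache exists."""
    if os.path.isfile(engine_path):
        # Later runs: reuse the engine serialized by an earlier run.
        with open(engine_path, "rb") as f:
            return f.read()
    # First run: the slow step, here represented by the hypothetical build_fn.
    engine_bytes = build_fn()
    with open(engine_path, "wb") as f:
        f.write(engine_bytes)
    return engine_bytes
```

In this sketch, deleting the cached file forces a rebuild on the next call, which is the behaviour you would want after moving to a different GPU.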
Performance
Performance on NVIDIA Jetson Nano:
- ~60ms +/- 20ms for face detection using mtCNN
- ~22ms +/- 2ms per face for facenet inference
- Total: ~15fps
Performance on NVIDIA Jetson AGX Xavier:
- ~40ms +/- 20ms for mtCNN
- ~9ms +/- 1ms per face for inference of facenet
- Total: ~22fps
License
Please respect all licenses of OpenCV and of the data that the machine learning models (mtCNN and Google FaceNet) were trained on.
FAQ
Sometimes the camera driver doesn't close properly. In that case you will have to restart the nvargus-daemon:
sudo systemctl restart nvargus-daemon
Info
Niclas Wesemann