Yolo DeepStream
Description
This repo has 4 parts:
1) yolov7_qat
In yolov7_qat, we use TensorRT's pytorch-quantization tool to fine-tune a QAT YOLOv7 model from the pre-trained weights. The resulting model reaches the same performance as PTQ in TensorRT on Jetson Orin-X, and its accuracy (mAP) drops only slightly.
2) tensorrt_yolov7
In tensorrt_yolov7, we provide a standalone C++ yolov7-app sample. You can use trtexec to convert FP32 ONNX models, or QAT INT8 models exported from yolov7_qat, to TensorRT engines, and then pass the engine to yolov7-app as its input. The app can run detection on images and videos, or test mAP on the COCO dataset.
3) deepstream_yolo
In deepstream_yolo, the sample shows how to integrate YOLO models into the DeepStream SDK with customized output-layer parsing for detected objects.
4) tensorrt_yolov4
In tensorrt_yolov4, the sample shows a standalone TensorRT application for YOLOv4.
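The trtexec conversion step described for tensorrt_yolov7 can be sketched as a small helper that assembles the command line. This is a minimal sketch, not the repo's own tooling: the function name `build_trtexec_command` and the file names `yolov7.onnx` and `yolov7.engine` are hypothetical, and only the widely used trtexec flags `--onnx`, `--saveEngine`, `--fp16`, and `--int8` are assumed.

```python
import shlex


def build_trtexec_command(onnx_path, engine_path, precision="fp16"):
    """Return the trtexec argument list that builds an engine from an ONNX file.

    Hypothetical helper for illustration; it only assembles arguments and
    does not call trtexec itself.
    """
    cmd = ["trtexec", f"--onnx={onnx_path}", f"--saveEngine={engine_path}"]
    if precision == "fp16":
        cmd.append("--fp16")
    elif precision == "int8":
        # A QAT model carries Q&DQ nodes, so INT8 has to be enabled.
        # FP16 is enabled as well so unquantized layers may run in FP16
        # (an assumption about the desired build, adjust as needed).
        cmd += ["--int8", "--fp16"]
    elif precision != "fp32":
        raise ValueError(f"unsupported precision: {precision}")
    return cmd


# Print the command for an FP16 build and for a QAT INT8 build.
print(shlex.join(build_trtexec_command("yolov7.onnx", "yolov7.engine")))
print(shlex.join(build_trtexec_command("yolov7_qat.onnx", "yolov7_qat.engine", "int8")))
```

To actually build the engine, the returned list can be passed to `subprocess.run(cmd, check=True)` on a machine where trtexec is on the PATH.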
Performance
For YoloV7 sample:
The table below shows the end-to-end performance of processing 1080p videos with this sample application.
- Testing devices:
  - Jetson AGX Orin 64GB (PowerMode: MAXN + GPU-freq: 1.3GHz + CPU: 12-core-2.2GHz)
  - Tesla T4
Device | Precision | Number of streams | Batch size | trtexec FPS | deepstream-app FPS with cuda-post-process | deepstream-app FPS with cpu-post-process |
---|---|---|---|---|---|---|
Orin-X | FP16 | 1 | 1 | 126 | 124 | 120 |
Orin-X | FP16 | 16 | 16 | 162 | 145 | 135 |
Orin-X | Int8(PTQ/QAT) | 1 | 1 | 180 | 175 | 128 |
Orin-X | Int8(PTQ/QAT) | 16 | 16 | 264 | 264 | 135 |
T4 | FP16 | 1 | 1 | 132 | 125 | 123 |
T4 | FP16 | 16 | 16 | 169 | 169 | 123 |
T4 | Int8(PTQ/QAT) | 1 | 1 | 208 | 170 | 127 |
T4 | Int8(PTQ/QAT) | 16 | 16 | 305 | 300 | 132 |
- Note: cudaGraph is not enabled for trtexec, because DeepStream does not support cudaGraph.
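To read the table as relative gains, the short sketch below computes the INT8 over FP16 throughput ratio for each device and batch size. The FPS numbers are copied from the table; the dictionary layout and the function name `int8_speedup` are illustrative assumptions, not part of this repo.

```python
# FPS values copied from the table, keyed by (device, precision, batch size).
# Each tuple holds: trtexec, deepstream-app with cuda-post-process,
# deepstream-app with cpu-post-process.
FPS = {
    ("Orin-X", "FP16", 1): (126, 124, 120),
    ("Orin-X", "FP16", 16): (162, 145, 135),
    ("Orin-X", "INT8", 1): (180, 175, 128),
    ("Orin-X", "INT8", 16): (264, 264, 135),
    ("T4", "FP16", 1): (132, 125, 123),
    ("T4", "FP16", 16): (169, 169, 123),
    ("T4", "INT8", 1): (208, 170, 127),
    ("T4", "INT8", 16): (305, 300, 132),
}
COLUMNS = ("trtexec", "cuda-post-process", "cpu-post-process")


def int8_speedup(device, batch, column):
    """Return INT8 FPS divided by FP16 FPS for one device, batch size, and column."""
    idx = COLUMNS.index(column)
    return FPS[(device, "INT8", batch)][idx] / FPS[(device, "FP16", batch)][idx]


for device in ("Orin-X", "T4"):
    for batch in (1, 16):
        ratios = ", ".join(
            f"{col}: {int8_speedup(device, batch, col):.2f}x" for col in COLUMNS
        )
        print(f"{device} batch {batch}: {ratios}")
```

With CPU post-processing the ratio stays close to 1 (for Orin-X at batch size 16 it is 135 / 135), which suggests that post-processing rather than inference limits throughput in that configuration.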
Code structure
├── deepstream_yolo
│   ├── config_infer_primary_yoloV4.txt # config file for yolov4 model
│   ├── config_infer_primary_yoloV7.txt # config file for yolov7 model
│   ├── deepstream_app_config_yolo.txt # DeepStream reference app configuration file for using YOLO models as the primary detector
│   ├── labels.txt # labels for COCO detection
│   ├── nvdsinfer_custom_impl_Yolo # output layer parsing function for detected objects for the YOLO model
│   │   ├── Makefile
│   │   └── nvdsparsebbox_Yolo.cpp
│   └── README.md
├── README.md
├── tensorrt_yolov4
│   ├── data
│   │   ├── demo.jpg # the demo image
│   │   └── demo_out.jpg # image detection output of the demo image
│   ├── Makefile
│   ├── Makefile.config
│   ├── README.md
│   └── source
│       ├── generate_coco_image_list.py # python script to get the list of image names from an MS COCO annotation or information file
│       ├── main.cpp # program entry point, where parameters are configured
│       ├── Makefile
│       ├── onnx_add_nms_plugin.py # python script to add a BatchedNMSPlugin node to the ONNX model
│       ├── SampleYolo.cpp # yolov4 inference class function definitions
│       └── SampleYolo.hpp # yolov4 inference class definition
├── tensorrt_yolov7
│   ├── CMakeLists.txt
│   ├── imgs # the demo images
│   │   ├── horses.jpg
│   │   └── zidane.jpg
│   ├── README.md
│   ├── samples
│   │   ├── detect.cpp # detection app for images
│   │   ├── validate_coco.cpp # app for validating on the COCO dataset
│   │   └── video_detect.cpp # detection app for videos
│   ├── src
│   │   ├── argsParser.cpp # argsParser helper class for command-line parsing
│   │   ├── argsParser.h # argsParser helper class for command-line parsing
│   │   ├── tools.h # helper functions for the Yolov7 class
│   │   ├── Yolov7.cpp # class Yolov7
│   │   └── Yolov7.h # class Yolov7
│   └── test_coco_map.py # tool for testing COCO mAP with a JSON file
└── yolov7_qat
    ├── doc
    │   └── Guidance_of_QAT_performance_optimization.md # guidance for Q&DQ insertion and placement with the pytorch-quantization tool
    ├── quantization
    │   ├── quantize.py # helper class for quantizing the yolov7 model
    │   └── rules.py # rules for Q&DQ node insertion and restrictions
    ├── README.md
    └── scripts
        ├── detect-trt.py # detect an image with a TensorRT engine
        ├── draw-engine.py # draw a TensorRT engine as a graph
        ├── eval-trt.py # script for evaluating TensorRT mAP
        ├── eval-trt.sh # command-line script for evaluating TensorRT mAP
        ├── qat.py # main function for QAT and PTQ
        └── trt-int8.py # TensorRT built-in calibration
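The tree mentions Q&DQ (quantize and dequantize) nodes several times. As a conceptual illustration only, the sketch below shows what one such pair does to a single value. It assumes symmetric per-tensor quantization with a signed 8-bit range of [-128, 127] and a hand-picked scale; the function names are hypothetical, and this is not how quantize.py or rules.py implement it.

```python
def quantize(x, scale):
    """Map a float to a signed 8-bit integer: divide by scale, round, then clamp."""
    q = round(x / scale)
    return max(-128, min(127, q))


def dequantize(q, scale):
    """Map the integer back to a float on the quantization grid."""
    return q * scale


scale = 0.05  # assumed scale; in practice it comes from calibration or training
for x in (0.12, -3.0, 10.0):
    q = quantize(x, scale)
    # The round trip shows the rounding error (0.12) and the clamping error (10.0).
    print(f"x={x} -> q={q} -> x'={dequantize(q, scale):.2f}")
```

Values inside the representable range come back with a small rounding error, while values beyond it saturate at the range limit.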