  • Language
    Python
  • License
    MIT License
  • Created almost 2 years ago
  • Updated 6 months ago

Repository Details

The second generation of the YOWO action detector.

YOWOv2: A Stronger yet Efficient Multi-level Detection Framework for Real-time Spatio-temporal Action Detection

Overview of YOWOv2

[Figure: overview of YOWOv2]

Requirements

  • We recommend using Anaconda to create a conda environment:
conda create -n yowo python=3.6
  • Then activate the environment:
conda activate yowo
  • Install the requirements:
pip install -r requirements.txt

Visualization

[Images: visualization examples]

Dataset

UCF101-24:

You can download UCF24 from the following links:

  • Google drive

Link: https://drive.google.com/file/d/1Dwh90pRi7uGkH5qLRjQIFiEmMJrAog5J/view?usp=sharing

  • BaiduYun Disk

Link: https://pan.baidu.com/s/11GZvbV0oAzBhNDVKXsVGKg

Password: hmu6

AVA

You can follow the instructions here to prepare the AVA dataset.

Experiment

  • UCF101-24
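| Model         | Clip | GFLOPs | Params  | F-mAP | V-mAP | FPS | Weight |
|---------------|------|--------|---------|-------|-------|-----|--------|
| YOWOv2-Nano   | 16   | 1.3    | 3.5 M   | 78.8  | 48.0  | 42  | ckpt   |
| YOWOv2-Tiny   | 16   | 2.9    | 10.9 M  | 80.5  | 51.3  | 50  | ckpt   |
| YOWOv2-Medium | 16   | 12.0   | 52.0 M  | 83.1  | 50.7  | 42  | ckpt   |
| YOWOv2-Large  | 16   | 53.6   | 109.7 M | 85.2  | 52.0  | 30  | ckpt   |
| YOWOv2-Nano   | 32   | 2.0    | 3.5 M   | 79.4  | 49.0  | 42  | ckpt   |
| YOWOv2-Tiny   | 32   | 4.5    | 10.9 M  | 83.0  | 51.2  | 50  | ckpt   |
| YOWOv2-Medium | 32   | 12.7   | 52.0 M  | 83.7  | 52.5  | 40  | ckpt   |
| YOWOv2-Large  | 32   | 91.9   | 109.7 M | 87.0  | 52.8  | 22  | ckpt   |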

All FLOPs are measured on a video clip of 16 or 32 frames at 224×224 resolution. FPS is measured with batch size 1 on a 3090 GPU, and the timed span runs from model inference through the NMS operation.
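
The FPS protocol described above (batch size 1, timing inference together with post-processing) can be sketched as follows. This is a minimal illustration under stated assumptions, not the benchmarking code used for the table: `measure_fps`, `infer_fn`, `postprocess_fn`, and `sync_fn` are hypothetical names, and the stand-in stages in the usage part replace a real model and NMS.

```python
import time


def measure_fps(infer_fn, postprocess_fn, clips, warmup=2, sync_fn=None):
    """Return clips per second, timing inference plus post-processing.

    infer_fn, postprocess_fn, and sync_fn are hypothetical callables,
    not functions from the YOWOv2 code base.
    """
    clips = list(clips)
    # Warm-up iterations are excluded from timing (lazy init, caches).
    for clip in clips[:warmup]:
        postprocess_fn(infer_fn(clip))
    timed = clips[warmup:]
    if not timed:
        raise ValueError("need more clips than warm-up iterations")

    if sync_fn is not None:
        sync_fn()  # on a GPU, wait for queued work before starting the clock
    start = time.perf_counter()
    for clip in timed:  # one clip per call, i.e. batch size 1
        postprocess_fn(infer_fn(clip))
    if sync_fn is not None:
        sync_fn()  # make sure all work has finished before stopping the clock
    elapsed = time.perf_counter() - start
    return len(timed) / max(elapsed, 1e-12)


if __name__ == "__main__":
    # Stand-in stages so the sketch runs without a model or a GPU.
    fake_infer = lambda clip: [x * 2 for x in clip]
    fake_nms = lambda outputs: outputs[:1]
    fps = measure_fps(fake_infer, fake_nms, [[1, 2, 3]] * 100)
    print(f"{fps:.1f} clips/s")
```

When timing on a GPU, a synchronization call is usually passed as `sync_fn`, because kernels run asynchronously and the clock would otherwise stop before the work is done.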

Qualitative results on UCF101-24: [image]

  • AVA v2.2
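| Model         | Clip | mAP  | FPS | Weight |
|---------------|------|------|-----|--------|
| YOWOv2-Nano   | 16   | 12.6 | 40  | ckpt   |
| YOWOv2-Tiny   | 16   | 14.9 | 49  | ckpt   |
| YOWOv2-Medium | 16   | 18.4 | 41  | ckpt   |
| YOWOv2-Large  | 16   | 20.2 | 29  | ckpt   |
| YOWOv2-Nano   | 32   | 12.7 | 40  | ckpt   |
| YOWOv2-Tiny   | 32   | 15.6 | 49  | ckpt   |
| YOWOv2-Medium | 32   | 18.4 | 40  | ckpt   |
| YOWOv2-Large  | 32   | 21.7 | 22  | ckpt   |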

Qualitative results on AVA: [image]

Train YOWOv2

  • UCF101-24

For example:

python train.py --cuda -d ucf24 --root path/to/dataset -v yowo_v2_nano --num_workers 4 --eval_epoch 1 --max_epoch 8 --lr_epoch 2 3 4 5 -lr 0.0001 -ldr 0.5 -bs 8 -accu 16 -K 16

or you can just run the script:

sh train_ucf.sh
  • AVA
python train.py --cuda -d ava_v2.2 --root path/to/dataset -v yowo_v2_nano --num_workers 4 --eval_epoch 1 --max_epoch 10 --lr_epoch 3 4 5 6 -lr 0.0001 -ldr 0.5 -bs 8 -accu 16 -K 16 --eval

or you can just run the script:

sh train_ava.sh

If you have multiple GPUs, you can train YOWOv2 with DDP (distributed data parallel), for example:

python train.py --cuda -dist -d ava_v2.2 --root path/to/dataset -v yowo_v2_nano --num_workers 4 --eval_epoch 1 --max_epoch 10 --lr_epoch 3 4 5 6 -lr 0.0001 -ldr 0.5 -bs 8 -accu 16 -K 16 --eval

However, I do not have multiple GPUs, so I cannot confirm that DDP training is free of bugs or that it reproduces the reported performance.
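
To make the hyperparameters in the training commands above easier to read, here is a minimal sketch of what they plausibly imply. It assumes that `-accu` is the number of gradient accumulation steps, that `-ldr` is a multiplicative decay factor applied once at each epoch listed in `--lr_epoch`, and that epochs are counted from 0. None of this has been checked against `train.py`, and the helper names are hypothetical.

```python
def effective_batch_size(batch_size, accumulate):
    """Samples contributing to one optimizer step (assumed meaning of -bs and -accu)."""
    return batch_size * accumulate


def lr_at_epoch(base_lr, decay_ratio, lr_epochs, epoch):
    """Learning rate in effect during `epoch` (0-based) under step decay.

    Assumption: the rate is multiplied by decay_ratio once for every
    milestone in lr_epochs that has already been reached.
    """
    num_decays = sum(1 for milestone in lr_epochs if epoch >= milestone)
    return base_lr * decay_ratio ** num_decays


if __name__ == "__main__":
    # Values taken from the UCF101-24 example command.
    print("effective batch size:", effective_batch_size(8, 16))
    for epoch in range(8):
        print(epoch, lr_at_epoch(0.0001, 0.5, [2, 3, 4, 5], epoch))
```

Under these assumptions the UCF101-24 command takes one optimizer step per 128 samples and halves the learning rate four times before the final epochs.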

Test YOWOv2

  • UCF101-24

For example:

python test.py --cuda -d ucf24 -v yowo_v2_nano --weight path/to/weight -size 224 --show
  • AVA

For example:

python test.py --cuda -d ava_v2.2 -v yowo_v2_nano --weight path/to/weight -size 224 --show

Test YOWOv2 on AVA video

For example:

python test_video_ava.py --cuda -d ava_v2.2 -v yowo_v2_nano --weight path/to/weight --video path/to/video --show

Note that path/to/video can point to any video on your local device, not only AVA videos.
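
Running a clip-based detector on a video stream requires a window of recent frames for every prediction. The sketch below shows one common way to maintain such a window, assuming a clip length of 16 as in the `-K 16` training flag. How `test_video_ava.py` actually builds its clips has not been verified here, and `ClipBuffer` is a hypothetical helper.

```python
from collections import deque


class ClipBuffer:
    """Keeps the most recent `clip_len` frames of a video stream."""

    def __init__(self, clip_len=16):
        self.clip_len = clip_len
        self.frames = deque(maxlen=clip_len)

    def push(self, frame):
        """Add a frame and return the current clip, oldest frame first."""
        if not self.frames:
            # First frame: pad by repetition so the clip is full from the start.
            self.frames.extend([frame] * self.clip_len)
        else:
            # A full deque with maxlen drops its oldest item on append.
            self.frames.append(frame)
        return list(self.frames)


if __name__ == "__main__":
    buffer = ClipBuffer(clip_len=4)
    for frame_id in ["f0", "f1", "f2"]:
        print(buffer.push(frame_id))
```

Padding with the first frame is only one option; waiting until enough frames have arrived is another, at the cost of no predictions for the opening frames.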

Evaluate YOWOv2

  • UCF101-24

For example:

# Frame mAP
python eval.py \
        --cuda \
        -d ucf24 \
        -v yowo_v2_nano \
        -bs 16 \
        -size 224 \
        --weight path/to/weight \
        --cal_frame_mAP

# Video mAP
python eval.py \
        --cuda \
        -d ucf24 \
        -v yowo_v2_nano \
        -bs 16 \
        -size 224 \
        --weight path/to/weight \
        --cal_video_mAP
  • AVA

Run the following command to calculate frame mAP@0.5 IoU:

python eval.py \
        --cuda \
        -d ava_v2.2 \
        -v yowo_v2_nano \
        -bs 16 \
        --weight path/to/weight
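
For readers unfamiliar with the 0.5 IoU criterion used in the frame mAP above, the sketch below computes the intersection over union of two boxes and checks a prediction against the threshold. It is a simplified illustration, not the evaluation code in `eval.py`: a full mAP computation also ranks detections by score, matches each ground-truth box at most once, and averages precision per class. The function names and the (x1, y1, x2, y2) box layout are assumptions.

```python
def box_iou(box_a, box_b):
    """IoU of two boxes given as (x1, y1, x2, y2) corner coordinates."""
    # Corners of the overlapping rectangle, if any.
    ix1 = max(box_a[0], box_b[0])
    iy1 = max(box_a[1], box_b[1])
    ix2 = min(box_a[2], box_b[2])
    iy2 = min(box_a[3], box_b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def is_true_positive(pred_box, gt_boxes, iou_thresh=0.5):
    """True if the prediction overlaps any ground-truth box at or above the threshold."""
    return any(box_iou(pred_box, gt) >= iou_thresh for gt in gt_boxes)


if __name__ == "__main__":
    prediction = (0, 0, 2, 2)
    ground_truth = [(0, 0, 2, 1.5)]
    print(box_iou(prediction, ground_truth[0]))
    print(is_true_positive(prediction, ground_truth))
```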

Demo

# run demo (pass -d ava_v2.2 instead of -d ucf24 to use an AVA model)
python demo.py --cuda -d ucf24 -v yowo_v2_nano -size 224 --weight path/to/weight --video path/to/video --show

Qualitative results in real scenarios: [image]

References

If you are using our code, please consider citing our paper.

@article{yang2023yowov2,
  title={YOWOv2: A Stronger yet Efficient Multi-level Detection Framework for Real-time Spatio-temporal Action Detection},
  author={Yang, Jianhua and Kun, Dai},
  journal={arXiv preprint arXiv:2302.06848},
  year={2023}
}
