  • Stars: 7
  • Rank: 2,294,772 (Top 46%)
  • Language: Python
  • Created 4 months ago
  • Updated 4 months ago

Repository Details

[TPAMI 2024] Dynamic MDETR: A Dynamic Multimodal Transformer Decoder for Visual Grounding

More Repositories

1

VideoMAE

[NeurIPS 2022 Spotlight] VideoMAE: Masked Autoencoders are Data-Efficient Learners for Self-Supervised Video Pre-Training
Python
1,295
star
2

MixFormer

[CVPR 2022 Oral & TPAMI 2024] MixFormer: End-to-End Tracking with Iterative Mixed Attention
Python
445
star
3

TDN

[CVPR 2021] TDN: Temporal Difference Networks for Efficient Action Recognition
Python
366
star
4

EMA-VFI

[CVPR 2023] Extracting Motion and Appearance via Inter-Frame Attention for Efficient Video Frame Interpolation
Python
339
star
5

SparseBEV

[ICCV 2023] SparseBEV: High-Performance Sparse 3D Object Detection from Multi-Camera Videos
Python
328
star
6

MOC-Detector

[ECCV 2020] Actions as Moving Points
Python
264
star
7

AdaMixer

[CVPR 2022 Oral] AdaMixer: A Fast-Converging Query-Based Object Detector
Jupyter Notebook
236
star
8

CamLiFlow

[CVPR 2022 Oral & TPAMI 2023] Learning Optical Flow and Scene Flow with Bidirectional Camera-LiDAR Fusion
Python
216
star
9

SparseOcc

[ECCV 2024] Fully Sparse 3D Occupancy Prediction & RayIoU Evaluation Metric
Python
199
star
10

MeMOTR

[ICCV 2023] MeMOTR: Long-Term Memory-Augmented Transformer for Multi-Object Tracking
Python
141
star
11

MixFormerV2

[NeurIPS 2023] MixFormerV2: Efficient Fully Transformer Tracking
Python
136
star
12

SportsMOT

[ICCV 2023] SportsMOT: A Large Multi-Object Tracking Dataset in Multiple Sports Scenes
Python
133
star
13

SADRNet

[TIP 2021] SADRNet: Self-Aligned Dual Face Regression Networks for Robust 3D Dense Face Alignment and Reconstruction
Python
126
star
14

MultiSports

[ICCV 2021] MultiSports: A Multi-Person Video Dataset of Spatio-Temporally Localized Sports Actions
Python
106
star
15

FCOT

[CVIU] Fully Convolutional Online Tracking
Python
91
star
16

MMN

[AAAI 2022] Negative Sample Matters: A Renaissance of Metric Learning for Temporal Grounding
Python
88
star
17

RTD-Action

[ICCV 2021] Relaxed Transformer Decoders for Direct Action Proposal Generation
Python
86
star
18

MOTIP

Multiple Object Tracking as ID Prediction
Python
84
star
19

BCN

[ECCV 2020] Boundary-Aware Cascade Networks for Temporal Action Segmentation
Python
84
star
20

LinK

[CVPR 2023] LinK: Linear Kernel for LiDAR-based 3D Perception
Python
81
star
21

MixSort

[ICCV 2023] MixSort: The Customized Tracker in SportsMOT
Python
69
star
22

CPD-Video

Learning Spatiotemporal Features via Video and Text Pair Discrimination
Python
60
star
23

SGM-VFI

[CVPR 2024] Sparse Global Matching for Video Frame Interpolation with Large Motion
Python
59
star
24

Structured-Sparse-RCNN

[CVPR 2022] Structured Sparse R-CNN for Direct Scene Graph Generation
Jupyter Notebook
57
star
25

TRACE

[ICCV 2021] Target Adaptive Context Aggregation for Video Scene Graph Generation
Python
57
star
26

CRCNN-Action

Context-aware RCNN: a Baseline for Action Detection in Videos
Python
53
star
27

STMixer

[CVPR 2023] STMixer: A One-Stage Sparse Action Detector
Python
49
star
28

BasicTAD

BasicTAD: an Astounding RGB-Only Baseline for Temporal Action Detection
Python
48
star
29

DDM

[CVPR 2022] Progressive Attention on Multi-Level Dense Difference Maps for Generic Event Boundary Detection
Python
48
star
30

VideoMAE-Action-Detection

[NeurIPS 2022 Spotlight] VideoMAE for Action Detection
Python
47
star
31

MGSampler

[ICCV 2021] MGSampler: An Explainable Sampling Strategy for Video Action Recognition
Python
46
star
32

FSL-Video

[BMVC 2021] A Closer Look at Few-Shot Video Classification: A New Baseline and Benchmark
Python
39
star
33

BIVDiff

[CVPR 2024] BIVDiff: A Training-free Framework for General-Purpose Video Synthesis via Bridging Image and Video Diffusion Models
Python
39
star
34

PointTAD

[NeurIPS 2022] PointTAD: Multi-Label Temporal Action Detection with Learnable Query Points
Python
37
star
35

TemporalPerceiver

[TPAMI 2023] Temporal Perceiver: A General Architecture for Arbitrary Boundary Detection
Python
34
star
36

TIA

[CVPR 2022] Task-specific Inconsistency Alignment for Domain Adaptive Object Detection
Python
33
star
37

CoMAE

[AAAI 2023] CoMAE: Single Model Hybrid Pre-training on Small-Scale RGB-D Datasets
Python
31
star
38

PDPP

[CVPR 2023 Highlight] PDPP: Projected Diffusion for Procedure Planning in Instructional Videos
Python
27
star
39

JoMoLD

[ECCV 2022] Joint-Modal Label Denoising for Weakly-Supervised Audio-Visual Video Parsing
Python
27
star
40

EVAD

[ICCV 2023] Efficient Video Action Detection with Token Dropout and Context Refinement
Python
24
star
41

CGA-Net

[CVPR 2021] CGA-Net: Category Guided Aggregation for Point Cloud Semantic Segmentation
Python
23
star
42

SSD-LT

[ICCV 2021] Self Supervision to Distillation for Long-Tailed Visual Recognition
Python
22
star
43

TREG

Target Transformed Regression for Accurate Tracking
Python
21
star
44

VFIMamba

VFIMamba: Video Frame Interpolation with State Space Models
Python
21
star
45

DEQDet

[ICCV 2023] Deep Equilibrium Object Detection
Jupyter Notebook
20
star
46

MGMAE

[ICCV 2023] MGMAE: Motion Guided Masking for Video Masked Autoencoding
Python
19
star
47

OCSampler

[CVPR 2022] OCSampler: Compressing Videos to One Clip with Single-step Sampling
Python
17
star
48

SportsHHI

[CVPR 2024] SportsHHI: A Dataset for Human-Human Interaction Detection in Sports Videos
Python
11
star
49

APP-Net

[TIP] APP-Net: Auxiliary-point-based Push and Pull Operations for Efficient Point Cloud Recognition
Python
11
star
50

AMD

[CVPR 2024] Asymmetric Masked Distillation for Pre-Training Small Foundation Models
Python
11
star
51

StageInteractor

[ICCV 2023] StageInteractor: Query-based Object Detector with Cross-stage Interaction
Python
9
star
52

SPLAM

[ECCV 2024 Oral] SPLAM: Accelerating Image Generation with Sub-path Linear Approximation Model
Python
9
star
53

CMPT

[IJCV 2021] Cross-Modal Pyramid Translation for RGB-D Scene Recognition
Python
8
star
54

VLG

VLG: General Video Recognition with Web Textual Knowledge (https://arxiv.org/abs/2212.01638)
Python
8
star
55

DGN

[IJCV 2023] Dual Graph Networks for Pose Estimation in Crowded Scenes
Python
7
star
56

BFRNet

Python
6
star
57

ViT-TAD

[CVPR 2024] Adapting Short-Term Transformers for Action Detection in Untrimmed Videos
Python
6
star
58

VideoEval

VideoEval: Comprehensive Benchmark Suite for Low-Cost Evaluation of Video Foundation Model
Python
6
star
59

ZeroI2V

[ECCV 2024] ZeroI2V: Zero-Cost Adaptation of Pre-trained Transformers from Image to Video
Python
5
star
60

PRVG

[CVIU 2024] End-to-end dense video grounding via parallel regression
Python
5
star
61

LogN

[IJCV 2024] Logit Normalization for Long-Tail Object Detection
Python
4
star