Tracking Anything in High Quality

Technical Report: arXiv:2307.13974 (https://arxiv.org/abs/2307.13974)

👋We welcome everyone to contribute to and collaborate on the HQTrack repository!

Tracking Anything in High Quality (HQTrack) is a framework for high-performance video object tracking and segmentation. It consists mainly of a Video Multi-Object Segmenter (VMOS) and a Mask Refiner (MR), and it can track multiple target objects at the same time while producing accurate object masks.
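To make the two-stage design concrete, here is a minimal Python sketch of the flow; the function names, stub bodies, and per-frame loop are illustrative assumptions, not HQTrack's actual API.

def vmos_segment(frame, prev_masks):
    # Stub for the Video Multi-Object Segmenter (stage 1): in HQTrack this is
    # a propagation network; here we simply carry the masks forward.
    return {obj_id: mask.copy() for obj_id, mask in prev_masks.items()}

def refine_mask(frame, coarse_mask):
    # Stub for the Mask Refiner (stage 2): HQTrack passes coarse masks to
    # HQ-SAM for higher-quality boundaries; here it is an identity pass.
    return coarse_mask

def track_video(frames, init_masks):
    # frames: list of H x W x 3 images; init_masks: {object_id: H x W mask}.
    results = [init_masks]
    for frame in frames[1:]:
        coarse = vmos_segment(frame, results[-1])  # stage 1: VMOS
        results.append({i: refine_mask(frame, m) for i, m in coarse.items()})  # stage 2: MR
    return results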

🍺 HQTrack obtained the runner-up prize in the Visual Object Tracking and Segmentation (VOTS2023) challenge.

📆TODO

  • Demo (can be run locally).
  • Training code.
  • Interactive WebUI.
  • Lightweight, compute-friendly version.

📢News

  • [2023/8/17] We release the training code.
  • [2023/7/30] We provide demo code that can be run locally.
  • [2023/7/22] We release the technical report for HQTrack.
  • [2023/7/3] HQTrack takes 2nd place in the VOTS2023 challenge.

🔥Demo

We also provide a demo script that supports box and point prompts as inputs. It is a pure Python script that lets users test arbitrary videos.
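For intuition, box and point prompts for initializing targets might look like the structures below; the exact format the demo script expects is an assumption here, so adapt the field names to the actual code.

# A box prompt: one axis-aligned box (x1, y1, x2, y2) in pixel coordinates.
box_prompt = {"object_id": 1, "box": (120, 80, 340, 260)}

# A point prompt: clicked (x, y) locations with labels
# (1 = foreground click, 0 = background click), as in SAM-style prompting.
point_prompt = {"object_id": 2, "points": [(200, 150), (50, 40)], "labels": [1, 0]}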

🐍Pipeline

(Figure: HQTrack pipeline overview)

📑Installation

  • Create and activate the conda environment
conda create -n hqtrack python=3.8
conda activate hqtrack
  • Install PyTorch
conda install pytorch==1.9 torchvision cudatoolkit=10.2 -c pytorch
  • Install HQ-SAM
cd segment_anything_hq
pip install -e .
pip install opencv-python pycocotools matplotlib onnxruntime onnx
  • Install the Pytorch-Correlation-extension package
cd packages/Pytorch-Correlation-extension/
python setup.py install
  • Install ops_dcnv3
cd HQTrack/networks/encoders/ops_dcnv3
./make.sh
  • Install the vot-toolkit
pip install vot-toolkit
  • Install other packages
pip install easydict lmdb einops jpeg4py 'protobuf~=3.19.0'
conda install setuptools==58.0.4
pip install timm tb-nightly tensorboardx scikit-image rsa six pillow
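After installing everything, a quick sanity check like the one below (a minimal sketch; it only uses packages named in the steps above) can confirm the environment is usable:

# Post-install sanity check for the hqtrack environment.
import torch
print("torch:", torch.__version__, "| CUDA available:", torch.cuda.is_available())

# Pytorch-Correlation-extension installs the spatial_correlation_sampler module.
import spatial_correlation_sampler  # noqa: F401
print("spatial_correlation_sampler imported OK")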

🚗Run HQTrack

  • Model Preparation

Download the VMOS model from Google Drive or Baidu Drive and put it under

/path/to/HQTrack/result/default_InternT_MSDeAOTL_V2/YTB_DAV_VIP/ckpt/

Download the HQ-SAM_h checkpoint and put it under

/path/to/HQTrack/segment_anything_hq/pretrained_model/
  • Initialize the VOTS workspace
cd /path/to/VOTS23_workspace
vot initialize tests/multiobject
  • Copy our trackers.ini to your vot workspace
cp /path/to/our/trackers.ini /path/to/VOTS23_workspace/trackers.ini
  • Modify the paths in trackers.ini to match your setup (a sketch of a typical entry follows this list)
  • Test the tracker and pack the results
bash run.sh
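For reference, a vot-toolkit trackers.ini entry typically has the shape sketched below; the label, command, and paths values are placeholders to adapt, not the exact contents of the shipped file.

[HQTrack]
label = HQTrack
protocol = traxpython
# 'command' names the Python module that implements the tracker entry point.
command = hqtrack_wrapper
paths = /path/to/HQTrack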

🐬Training

Stage 1

In stage 1, we pre-train VMOS on synthetic video sequences generated from static image datasets. We refer readers to AFB-URR for preparing the pre-training data. Put the static datasets under

/path/to/HQTrack/datasets/

and the pretrained backbone checkpoint under

/path/to/HQTrack/pretrain_models/
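To make "synthetic video sequences from static images" concrete, pre-training pipelines of this kind typically fake motion by applying random geometric jitter to one image/mask pair per frame. The sketch below uses assumed augmentation ranges, not HQTrack's exact recipe:

# Sketch: turn one static image + mask into a short synthetic clip.
import random
import torchvision.transforms.functional as TF

def synthesize_clip(image, mask, num_frames=5):
    # image/mask: PIL Images; the same random affine is applied to both
    # so the mask stays aligned with the image.
    frames, masks = [image], [mask]
    for _ in range(num_frames - 1):
        angle = random.uniform(-15, 15)
        translate = (random.randint(-20, 20), random.randint(-20, 20))
        scale = random.uniform(0.9, 1.1)
        frames.append(TF.affine(frames[-1], angle, translate, scale, shear=0))
        masks.append(TF.affine(masks[-1], angle, translate, scale, shear=0))
    return frames, masks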
  • Convert the backbone checkpoint to match the VMOS model
python my_tools/transfer_intern_pretrained_model.py
  • Set the relevant training args in
/path/to/HQTrack/configs/pre.py
  • Start stage 1 pre-training by running:
CUDA_VISIBLE_DEVICES="1" python tools/train.py --amp \
	--exp_name "Static_Pre" \
	--stage "pre" \
	--model "internT_msdeaotl_v2" \
	--gpu_num "1"

Stage 2

In stage 2, video multi-object segmentation datasets, e.g., DAVIS and YouTube-VOS, are employed for training.

  • Prepare the datasets (a typical layout is sketched after this list) and put them under
/path/to/HQTrack/datasets/
  • Start stage 2 training by running:
CUDA_VISIBLE_DEVICES="1" python tools/train.py --amp \
	--exp_name "default" \
	--stage "ytb_vip_dav_deaot_internT" \
	--model "internT_msdeaotl_v2" \
	--gpu_num "1"
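A typical on-disk layout for these datasets is sketched below; the exact directory names HQTrack's dataloaders expect may differ, so treat this as an assumption based on the standard DAVIS and YouTube-VOS releases:

datasets/
├── DAVIS/
│   ├── JPEGImages/480p/<sequence>/*.jpg
│   └── Annotations/480p/<sequence>/*.png
└── YTB/
    ├── train/JPEGImages/<sequence>/*.jpg
    └── train/Annotations/<sequence>/*.png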

You can include more training datasets such as VIPSeg, BURST, MOTS, and OVIS for better performance.

📖 Citation

If you find HQTrack useful, please consider citing it 📣

@misc{hqtrack,
      title={Tracking Anything in High Quality},
      author={Jiawen Zhu and Zhenyu Chen and Zeqi Hao and Shijie Chang and Lu Zhang and Dong Wang and Huchuan Lu and Bin Luo and Jun-Yan He and Jin-Peng Lan and Hanyuan Chen and Chenyang Li},
      year={2023},
      eprint={2307.13974},
      archivePrefix={arXiv},
      primaryClass={cs.CV}
}

♥️ Acknowledgment

This project builds on DeAOT, HQ-SAM, and SAM. Thanks to the authors of these excellent works.

📧Contact

If you have any questions, feel free to email [email protected]. ^_^