Instances as Queries
- [News]
  - Apr, 2022: If you like QueryInst for instance segmentation, you might also like TeViT (CVPR 2022, oral, paper / code & models) for high-performance video instance segmentation!
  - Oct, 2021: QueryInst (ICCV 2021) is now officially included in the mmdetection library, with new checkpoints, corresponding logs, and augmented training settings. We suggest using the newest QueryInst implementation in mmdetection; this repo will continue to be maintained as well. Issues are welcome if you have problems using QueryInst to reproduce the COCO AP reported in our paper.
- TL;DR: QueryInst (Instances as Queries) is a simple and effective query-based instance segmentation method driven by parallel supervision on dynamic mask heads, which outperforms prior art in both accuracy and speed.
- Our QueryTrack (i.e., Tracking Instances as Queries, tech report), based on QueryInst, won 2nd place (AP = 52.3 @ test set, AP = 54.3 @ val set) in the video instance segmentation (VIS) track of the 3rd Large-scale Video Object Segmentation Challenge, CVPR 2021, using a single online end-to-end model, single-scale testing, and no extra video training data.
- For the first time, we demonstrate that an end-to-end query-based framework driven by parallel supervision is competitive with well-established and highly optimized methods in a wide range of instance-level recognition tasks (object detection, instance segmentation, and video instance segmentation).
by Yuxin Fang*, Shusheng Yang*, Xinggang Wang†, Yu Li, Chen Fang, Ying Shan, Bin Feng, Wenyu Liu.
(*) equal contribution, (†) corresponding author.
- This repo serves as the official implementation of QueryInst, based on mmdetection and built upon Sparse R-CNN & DETR. Implementations based on Detectron2 will be released in the near future.
- This project is under active development; we will extend QueryInst to a wide range of instance-level recognition tasks.
Main Results on COCO test-dev
| Configs | Aug. | Weights | Box AP | Mask AP |
|---|---|---|---|---|
| QueryInst_Swin_L_300_queries (single scale testing) | 400 ~ 1200, w/ Crop | baidu / google | 56.1 | 49.1 |
Main Results on COCO val
| Configs | Aug. | Weights | Box AP | Mask AP |
|---|---|---|---|---|
| QueryInst_R50_3x_300_queries | 480 ~ 800, w/ Crop | baidu / google | 46.9 | 41.4 |
| QueryInst_R101_3x_300_queries | 480 ~ 800, w/ Crop | baidu / google | 48.0 | 42.4 |
| QueryInst_X101-DCN_3x_300_queries | 480 ~ 800, w/ Crop | - | 50.3 | 44.2 |
| QueryInst_Swin_L_300_queries (single scale testing) | 400 ~ 1200, w/ Crop | baidu / google | 56.1 | 48.9 |
Notes:
- Access code for baidu is QIst.
Getting Started
- Our project is mainly developed on the mmdetection toolbox (931d96); please refer to the official mmdetection installation guide.
- Install QueryInst by:
python setup.py develop
- Prepare datasets:
mkdir data && cd data
ln -s /path/to/coco coco
- Train QueryInst with a single GPU:
python tools/train.py configs/queryinst/queryinst_r50_fpn_1x_coco.py
- Train QueryInst with multiple GPUs:
./tools/dist_train.sh configs/queryinst/queryinst_r50_fpn_1x_coco.py 8
- Test QueryInst on the COCO val set with a single GPU:
python tools/test.py configs/queryinst/queryinst_r50_fpn_1x_coco.py PATH/TO/CKPT.pth --eval bbox segm
- Test QueryInst on the COCO val set with multiple GPUs:
./tools/dist_test.sh configs/queryinst/queryinst_r50_fpn_1x_coco.py PATH/TO/CKPT.pth 8 --eval bbox segm
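The steps above can be wrapped in a small helper script. The sketch below is illustrative only and not part of this repo: `missing_coco_entries`, `train_command`, and `eval_command` are hypothetical names, and the expected sub-directories (`annotations`, `train2017`, `val2017`) assume the usual COCO 2017 layout, which this README does not spell out. It only builds the same command lines shown above and checks the dataset symlink; it does not call any mmdetection API.

```python
import shlex
from pathlib import Path

# Assumption: the usual COCO 2017 layout under data/coco (not stated in this README).
EXPECTED_COCO_ENTRIES = ("annotations", "train2017", "val2017")

# Config path taken from the commands above.
CONFIG = "configs/queryinst/queryinst_r50_fpn_1x_coco.py"


def missing_coco_entries(root="data/coco"):
    """Return the expected sub-directories that are absent under `root`."""
    root = Path(root)
    return [name for name in EXPECTED_COCO_ENTRIES if not (root / name).is_dir()]


def train_command(config=CONFIG, gpus=1):
    """Build the training command for one GPU or for several."""
    if gpus < 1:
        raise ValueError("gpus must be at least 1")
    if gpus == 1:
        return ["python", "tools/train.py", config]
    return ["./tools/dist_train.sh", config, str(gpus)]


def eval_command(checkpoint, config=CONFIG, gpus=1, metrics=("bbox", "segm")):
    """Build the evaluation command for one GPU or for several."""
    if gpus < 1:
        raise ValueError("gpus must be at least 1")
    if gpus == 1:
        head = ["python", "tools/test.py", config, checkpoint]
    else:
        head = ["./tools/dist_test.sh", config, checkpoint, str(gpus)]
    return head + ["--eval", *metrics]


if __name__ == "__main__":
    missing = missing_coco_entries()
    if missing:
        print("Missing under data/coco:", ", ".join(missing))
    # Print the commands instead of running them, so they can be inspected first.
    print(shlex.join(train_command(gpus=8)))
    print(shlex.join(eval_command("PATH/TO/CKPT.pth", gpus=8)))
```

Returning argument lists rather than strings lets each command be passed directly to `subprocess.run(cmd, check=True)` from the repo root once the dataset check passes.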
Citation
If you find our paper and code useful in your research, please consider giving this repo a star and citing our work:
@InProceedings{Fang_2021_ICCV,
author = {Fang, Yuxin and Yang, Shusheng and Wang, Xinggang and Li, Yu and Fang, Chen and Shan, Ying and Feng, Bin and Liu, Wenyu},
title = {Instances As Queries},
booktitle = {Proceedings of the IEEE/CVF International Conference on Computer Vision (ICCV)},
month = {October},
year = {2021},
pages = {6910-6919}
}
@article{QueryTrack,
title={Tracking Instances as Queries},
author={Yang, Shusheng and Fang, Yuxin and Wang, Xinggang and Li, Yu and Shan, Ying and Feng, Bin and Liu, Wenyu},
journal={arXiv preprint arXiv:2106.11963},
year={2021}
}
TODO
- QueryInst training and inference code.
- QueryInst with Swin-Transformer and Test-Time-Augmentation.
- QueryInst configurations for Cityscapes and YouTube-VIS.
- QueryInst pretrain weights.