• Stars
    star
    650
  • Rank 69,267 (Top 2 %)
  • Language
    Python
  • License
    Apache License 2.0
  • Created over 1 year ago
  • Updated 10 months ago

Reviews

There are no reviews yet. Be the first to send feedback to the community and the maintainers!

Repository Details

[ICCV 2023] Official implementation of the paper "A Simple Framework for Open-Vocabulary Segmentation and Detection"

OpenSeeD

PWC PWC PWC PWC

This is the official implementation of the paper "A Simple Framework for Open-Vocabulary Segmentation and Detection".

openseed_9.4m.mp4

You can also find the more detailed demo at video link on Youtube.

๐Ÿ‘‰ [New] demo code is available ๐Ÿ‘‰ [New] OpenSeeD has been accepted to ICCV 2023! training code is available!

๐Ÿš€ Key Features

  • A Simple Framework for Open-Vocabulary Segmentation and Detection.
  • Support interactive segmentation with box input to generate mask.

๐Ÿ’ก Installation

pip3 install torch==1.13.1 torchvision==0.14.1 --extra-index-url https://download.pytorch.org/whl/cu113
python -m pip install 'git+https://github.com/MaureenZOU/detectron2-xyz.git'
pip install git+https://github.com/cocodataset/panopticapi.git
python -m pip install -r requirements.txt
export DATASET=/pth/to/dataset

Download the pretrained checkpoint from here.

๐Ÿ’ก Demo script

python demo/demo_panoseg.py evaluate --conf_files configs/openseed/openseed_swint_lang.yaml  --image_path images/your_image.jpg --overrides WEIGHT /path/to/ckpt/model_state_dict_swint_51.2ap.pt

๐Ÿ”ฅ Remember to modify the vocabulary thing_classes and stuff_classes in demo_panoseg.py if your want to segment open-vocabulary objects.

Evaluation on coco

python train_net.py --original_load --eval_only --num-gpus 8 --config-file configs/openseed/openseed_swint_lang.yaml MODEL.WEIGHTS=[/path/to/lang/weight](https://github.com/IDEA-Research/OpenSeeD/releases/download/openseed/model_state_dict_swint_51.2ap.pt)

You are expected to get 55.4 PQ.

Training OpenSeeD baseline

Training on coco

python train_net.py --num-gpus 8 --config-file configs/openseed/openseed_swint_lang.yaml --lang_weight [/path/to/lang/weight](https://github.com/IDEA-Research/OpenSeeD/releases/download/training/model_state_dict_only_language.pt)

Training on coco+o365

python train_net.py --num-gpus 8 --config-file configs/openseed/openseed_swint_lang_o365.yaml --lang_weight [/path/to/lang/weight](https://github.com/IDEA-Research/OpenSeeD/releases/download/training/model_state_dict_only_language.pt)

hero_figure

๐Ÿฆ„ Model Framework

hero_figure

๐ŸŒ‹ Results

Results on open segmentation hero_figure Results on task transfer and segmentation in the wild hero_figure

Citing OpenSeeD

If you find our work helpful for your research, please consider citing the following BibTeX entry.

@article{zhang2023simple,
  title={A Simple Framework for Open-Vocabulary Segmentation and Detection},
  author={Zhang, Hao and Li, Feng and Zou, Xueyan and Liu, Shilong and Li, Chunyuan and Gao, Jianfeng and Yang, Jianwei and Zhang, Lei},
  journal={arXiv preprint arXiv:2303.08131},
  year={2023}
}

More Repositories

1

Grounded-Segment-Anything

Grounded SAM: Marrying Grounding DINO with Segment Anything & Stable Diffusion & Recognize Anything - Automatically Detect , Segment and Generate Anything
Jupyter Notebook
14,724
star
2

GroundingDINO

[ECCV 2024] Official implementation of the paper "Grounding DINO: Marrying DINO with Grounded Pre-Training for Open-Set Object Detection"
Python
6,003
star
3

DINO

[ICLR 2023] Official implementation of the paper "DINO: DETR with Improved DeNoising Anchor Boxes for End-to-End Object Detection"
Python
2,160
star
4

T-Rex

[ECCV2024] API code for T-Rex2: Towards Generic Object Detection via Text-Visual Prompt Synergy
Python
2,147
star
5

DWPose

"Effective Whole-body Pose Estimation with Two-stages Distillation" (ICCV 2023, CV4Metaverse Workshop)
Python
2,136
star
6

detrex

detrex is a research platform for DETR-based object detection, segmentation, pose estimation and other visual recognition tasks.
Python
2,001
star
7

awesome-detection-transformer

Collect some papers about transformer for detection and segmentation. Awesome Detection Transformer for Computer Vision (CV)
1,261
star
8

MaskDINO

[CVPR 2023] Official implementation of the paper "Mask DINO: Towards A Unified Transformer-based Framework for Object Detection and Segmentation"
Python
1,149
star
9

Grounding-DINO-1.5-API

API for Grounding DINO 1.5: IDEA Research's Most Capable Open-World Object Detection Model Series
Python
680
star
10

Motion-X

[NeurIPS 2023] Official implementation of the paper "Motion-X: A Large-scale 3D Expressive Whole-body Human Motion Dataset"
Python
542
star
11

DN-DETR

[CVPR 2022 Oral] Official implementation of DN-DETR
Python
535
star
12

DAB-DETR

[ICLR 2022] Official implementation of the paper "DAB-DETR: Dynamic Anchor Boxes are Better Queries for DETR"
Jupyter Notebook
499
star
13

OSX

[CVPR 2023] Official implementation of the paper "One-Stage 3D Whole-Body Mesh Recovery with Component Aware Transformer"
Python
291
star
14

HumanTOMATO

[ICML 2024] ๐Ÿ…HumanTOMATO: Text-aligned Whole-body Motion Generation
Python
276
star
15

MotionLLM

[Arxiv-2024] MotionLLM: Understanding Human Behaviors from Human Motions and Videos
Python
226
star
16

deepdataspace

The Go-To Choice for CV Data Visualization, Annotation, and Model Analysis.
TypeScript
212
star
17

Stable-DINO

[ICCV 2023] Official implementation of the paper "Detection Transformer with Stable Matching"
Python
203
star
18

Lite-DETR

[CVPR 2023] Official implementation of the paper "Lite DETR : An Interleaved Multi-Scale Encoder for Efficient DETR"
Python
182
star
19

DreamWaltz

[NeurIPS 2023] Official implementation of the paper "DreamWaltz: Make a Scene with Complex 3D Animatable Avatars".
Python
176
star
20

MP-Former

[CVPR 2023] Official implementation of the paper: MP-Former: Mask-Piloted Transformer for Image Segmentation
Python
99
star
21

HumanSD

The official implementation of paper "HumanSD: A Native Skeleton-Guided Diffusion Model for Human Image Generation"
Python
92
star
22

HumanArt

The official implementation of CVPR 2023 paper "Human-Art: A Versatile Human-Centric Dataset Bridging Natural and Artificial Scenes"
86
star
23

ED-Pose

The official repo for [ICLR'23] "Explicit Box Detection Unifies End-to-End Multi-Person Pose Estimation "
Python
73
star
24

DQ-DETR

[AAAI 2023] DQ-DETR: Dual Query Detection Transformer for Phrase Extraction and Grounding
54
star
25

DisCo-CLIP

Official PyTorch implementation of the paper "DisCo-CLIP: A Distributed Contrastive Loss for Memory Efficient CLIP Training".
Python
47
star
26

LipsFormer

Python
34
star
27

DiffHOI

Official implementation of the paper "Boosting Human-Object Interaction Detection with Text-to-Image Diffusion Model"
Python
29
star
28

hana

Implementation and checkpoints of Imagen, Google's text-to-image synthesis neural network, in Pytorch
Python
17
star
29

TOSS

[ICLR 2024] Official implementation of the paper "Toss: High-quality text-guided novel view synthesis from a single image"
Python
15
star
30

IYFC

C++
9
star
31

TAPTR

6
star
32

detrex-storage

2
star