Segment Anything marks a new breakthrough in computer vision (CV). This repository tracks and summarizes research progress on Segment Anything across fields, including papers and projects.
If you find this repository helpful, please consider starring ⭐ or sharing ⬆️ it. Thanks.
- 2023.8.29: Update some recent works.
- 2023.5.20: Update document structure and add a robotic-related article. Happy 520 Day!
- 2023.5.4: Add SEEM.
- 2023.4.18: Add two nice works: Inpaint Anything and SAM-Track.
- 2023.4.12: Add some presentations.
- 2023.4.12: An initial version of recent papers or projects.

| Title | Presentation | Paper page | Project page | Code base | Affiliation | Description |
|:---|:---|:---|:---|:---|:---|:---|
| CLIP | | arXiv | Colab | Code | OpenAI | Contrastive Language-Image Pre-Training. |
| OWL-ViT | | ECCV2022 | - | Code | Google | An open-vocabulary object detector. |
| OvSeg | | CVPR2023 | Project | Code | Meta | Segment an image into semantic regions according to text descriptions. |
| Painter | | CVPR2023 | - | Code | BAAI | A Generalist Painter for In-Context Visual Learning. |
| Grounding DINO | | arXiv | Colab & Huggingface | Code | IDEA | A stronger open-set object detector. |
| Segment Anything | | arXiv | Project page | Code | Meta | A strong large model that can generate masks for all objects in an image. |
| SegGPT | | arXiv | Project page | Code | BAAI | Segmenting Everything In Context, based on Painter. |
| Segment Everything Everywhere All at Once (SEEM) | | arXiv | Project Page | Code | Microsoft | Semantic segmentation with various prompt types. |

| Title | Presentation | Paper page | Project page | Code base | Affiliation | Description |
|:---|:---|:---|:---|:---|:---|:---|
| CLIP_Surgery | | arXiv | Demo | Code | HKUST | Builds on CLIP's explainability to let SAM go from text to mask without manual points. |
| Segment Anything Is Not Always Perfect | | arXiv | - | - | Samsung | Analyzes and discusses the benefits and limitations of SAM. |
| PerSAM | | arXiv | Project Page | Code | - | Segment Anything with specific concepts. |
| Matcher: Segment Anything with One Shot Using All-Purpose Feature Matching | | arXiv | - | Code | - | One-shot semantic segmentation by integrating an all-purpose feature extraction model and a class-agnostic segmentation model. |
| Segment Anything in High Quality | | arXiv | Project Page | - | ETH Zürich & HKUST | HQ-SAM: improves the segmentation quality of SAM using a learnable High-Quality Output Token. |
| Detect Any Shadow: Segment Anything for Video Shadow Detection | | arXiv | - | Code | University of Science and Technology of China | Uses SAM for the initial frames, then an LSTM network for subsequent frames. |
| Fast Segment Anything | | arXiv | Project Page | Code | - | Reformulates the architecture and improves the speed of SAM. |
| MobileSAM (Faster Segment Anything) | | arXiv | Project Page | Code | Kyung Hee University | Makes SAM mobile-friendly by replacing the heavyweight image encoder with a lightweight one. |
| FoodSAM (Any Food Segmentation) | | arXiv | Project Page | Code | UCAS | Semantic, instance, panoptic, and interactive segmentation on food images. |

| Title | Presentation | Paper page | Project page | Code base | Affiliation | Description |
|:---|:---|:---|:---|:---|:---|:---|
| Segment Anything Model (SAM) for Digital Pathology | | arXiv | - | - | - | SAM + tumor segmentation / tissue segmentation / cell nuclei segmentation. |
| Segment Anything in Medical Images | | arXiv | - | Code | - | A step-by-step tutorial with a small dataset to help you quickly utilize SAM. |
| SAM Fails to Segment Anything? | | arXiv | - | Code | - | SAM-adapter: Adapting SAM in Underperformed Scenes: Camouflage, Shadow, Medical Image Segmentation, and More. |
| Segment Anything Model for Medical Image Analysis: an Experimental Study | | arXiv | - | - | - | Thorough experiments evaluating how SAM performs on 19 medical image datasets. |
| Medical-SAM-Adapter | | arXiv | - | - | - | A project to fine-tune SAM using adaptation for medical imaging. |
| SAM-Med2D | | arXiv | - | Code | Sichuan University & Shanghai AI Laboratory | The most comprehensive studies on applying SAM to 2D medical images. |

| Title | Presentation | Paper page | Project page | Code base | Affiliation | Description |
|:---|:---|:---|:---|:---|:---|:---|
| Segment Anything for Microscopy | | bioRxiv | Demo | Code | University of Göttingen, Germany | Segment Anything for Microscopy implements automatic and interactive annotation for microscopy data. It is built on top of Segment Anything and specializes it for microscopy and other bio-imaging data. Its core components are: the micro_sam tools for interactive data annotation with napari; the micro_sam library to apply Segment Anything to 2D and 3D data or fine-tune it on your data; and the micro_sam models that are fine-tuned on publicly available microscopy data. The goal is to build fast and interactive annotation tools for microscopy data. |

| Title | Presentation | Paper page | Project page | Code base | Affiliation | Description |
|:---|:---|:---|:---|:---|:---|:---|
| Inpaint Anything | | arXiv | - | Code | USTC & EIT | SAM + inpainting, which is able to remove objects smoothly. |
| SAM + Stable Diffusion for Text-to-Image Inpainting | | - | Project | Code | comet | Grounding DINO + SAM + Stable Diffusion. |

| Title | Presentation | Paper page | Project page | Code base | Affiliation | Description |
|:---|:---|:---|:---|:---|:---|:---|
| SAMCOD | - | arXiv | - | Code | - | SAM + Camouflaged object detection (COD) task. |

| Title | Presentation | Paper page | Project page | Code base | Affiliation | Description |
|:---|:---|:---|:---|:---|:---|:---|
| Segment Anything in Video Super-resolution | | arXiv | - | - | - | A first step toward using SAM for low-level vision. |
| SAM-IQA | | arXiv | - | Code | Megvii | The first work to introduce SAM to image quality assessment (IQA), demonstrating its strong generalization ability in this domain. |

| Title | Presentation | Paper page | Project page | Code base | Affiliation | Description |
|:---|:---|:---|:---|:---|:---|:---|
| Matte Anything | | arXiv | - | Code | HUST Vision Lab | An interactive natural image matting system with excellent performance for both opaque and transparent objects. |
| Matting Anything | | arXiv | Project page | Code | SHI Labs | Leverages feature maps from SAM and adopts a Mask-to-Matte module to predict the alpha matte. |

| Title | Presentation | Paper page | Project page | Code base | Affiliation | Description |
|:---|:---|:---|:---|:---|:---|:---|
| Instruct2Act | | arXiv | - | Code | OpenGVLab | A SAM application in robotics. |

| Title | Presentation | Paper page | Project page | Code base | Affiliation | Description |
|:---|:---|:---|:---|:---|:---|:---|
| IAMSAM | | bioRxiv | - | Code | Portrai Inc. | A SAM application for the analysis of spatial transcriptomics. |

| Title | Presentation | Paper page | Project page | Code base | Affiliation | Description |
|:---|:---|:---|:---|:---|:---|:---|
| Seal | | arXiv | Page | Code | - | A framework capable of leveraging 2D vision foundation models for self-supervised learning on large-scale 3D point clouds. |
| TomoSAM | | arXiv | Video Tutorial | Code | - | An extension of 3D Slicer that uses SAM to aid the segmentation of 3D data from tomography or other imaging techniques. |
| SegmentAnythingin3D | | arXiv | Project | Code | - | A novel framework to Segment Anything in 3D, named SA3D. |

| Title | Presentation | Paper page | Project page | Code base | Affiliation | Description |
|:---|:---|:---|:---|:---|:---|:---|
| RSPrompter | | arXiv | Project Page | Code | Beihang University | An automated instance segmentation approach for remote sensing images based on SAM. |

| Title | Presentation | Paper page | Project page | Code base | Affiliation | Description |
|:---|:---|:---|:---|:---|:---|:---|
| Follow Anything | | arXiv | Page | Code | MIT, Harvard University | An open-vocabulary, multimodal model that detects, tracks, and follows any object in real time. |
| Track-Anything | Video | arXiv | - | Code | MIT, Harvard University | An open-vocabulary, multimodal model that detects, tracks, and follows any object in real time. |
| SAM-Track | Video | arXiv | - | Code | Zhejiang University | A framework called Segment And Track Anything (SAM-Track) that allows users to precisely and effectively segment and track any object in a video. |

| Title | Presentation | Paper page | Project page | Code base | Affiliation | Description |
|:---|:---|:---|:---|:---|:---|:---|
| AV-SAM | | arXiv | - | Code | CMU | A simple yet effective audio-visual localization and segmentation framework based on SAM. |

| Title | Presentation | Paper page | Project page | Code base | Affiliation | Description |
|:---|:---|:---|:---|:---|:---|:---|
| Attack-SAM | - | arXiv | - | - | KAIST | The first work to conduct a comprehensive investigation on how to attack SAM with adversarial examples. |

| Title | Presentation | Project page | Code base | Affiliation | Description |
|:---|:---|:---|:---|:---|:---|
| Grounded Segment Anything | | Colab & Huggingface | Code | - | Combining Grounding DINO and Segment Anything. |
| GroundedSAM Anomaly Detection | | - | Code | - | Grounding DINO + SAM to segment any anomaly. |
| Semantic Segment Anything | | - | Code | Fudan | A dense category annotation engine. |
| Magic Copy | | - | Code | - | Magic Copy is a Chrome extension that uses SAM. |
| Segment Anything with Clip | | - | Code | - | SAM + CLIP. |
| SAM-Clip | | - | Code | - | SAM + CLIP. |
| Prompt Segment Anything | | - | Code | - | SAM + Zero-shot Instance Segmentation. |
| RefSAM | - | - | Code | - | Evaluating the basic performance of SAM on the referring image segmentation task. |
| SAM-RBox | | - | Code | - | An implementation of SAM for generating rotated bounding boxes with MMRotate. |
| Open Vocabulary Segment Anything | | - | Code | - | An interesting demo combining Google's OWL-ViT and SAM. |
| SegDrawer | | - | Code | - | Simple static web-based mask drawer, supporting semantic drawing with SAM. |
| AnyLabeling | | YoutubeDemo | Code | - | SAM + Labelme + LabelImg + Auto-labeling. |
| Annotation Anything Pipeline | | - | Code | - | GPT + SAM. |
| Roboflow Annotate | | App | Blog | Roboflow | SAM-assisted labeling for training computer vision models. |
| SALT | | - | Code | - | A tool that adds a basic interface for image labeling and saves the generated masks in COCO format. |
| SAM U Specify | | - | Code | - | Use SAM and a CLIP model to segment the unique instances you want. |
| SAM web UI | | App | Code | - | A new web interface for SAM. |
| Finetune Anything | | - | Code | - | A class-aware one-stage tool for fine-tuning models based on SAM. |

| Title | Presentation | Project page | Code base | Affiliation | Description |
|:---|:---|:---|:---|:---|:---|
| MetaSeg | | HuggingFace | Code | - | SAM + Video. |
| SAM-Track | Video | YoutubeDemo | Code | Zhejiang University | This project, which is based on SAM and DeAOT, focuses on segmenting and tracking objects in videos. |

| Title | Presentation | Project page | Code base | Affiliation | Description |
|:---|:---|:---|:---|:---|:---|
| SAM in Napari | Video | - | Code | - | Segment anything with Napari integration of SAM. |
| SAM Medical Imaging | | - | Code | - | SAM for Medical Imaging. |

| Title | Presentation | Project page | Code base | Affiliation | Description |
|:---|:---|:---|:---|:---|:---|
| SegAnythingPro | | - | Code | - | SAM + Inpainting/Replacing. |

| Title | Presentation | Project page | Code base | Affiliation | Description |
|:---|:---|:---|:---|:---|:---|
| 3D-Box | | - | Code | - | SAM is extended to 3D perception by combining it with VoxelNeXt. |
| Anything 3D Novel View | | - | Code | - | SAM + Zero 1-to-3. |
| Any 3D Face | | - | Code | - | SAM + HRN. |
| Segment Anything 3D | | - | Code | Pointcept | Extending Segment Anything to 3D perception by transferring the segmentation information of 2D images to 3D space. |

| Title | Presentation | Project page | Code base | Affiliation | Description |
|:---|:---|:---|:---|:---|:---|
| Edit Anything | | - | Code | - | Edit and Generate Anything in an image. |
| Image Edit Anything | | - | Code | - | Stable Diffusion + SAM. |
| SAM for Stable Diffusion Webui | | - | Code | - | Stable Diffusion + SAM. |

| Title | Presentation | Project page | Code base | Affiliation | Description |
|:---|:---|:---|:---|:---|:---|
| Earth Observation Tools | | Colab | Code | - | SAM + Remote Sensing. |

| Title | Presentation | Project page | Code base | Affiliation | Description |
|:---|:---|:---|:---|:---|:---|
| Moving Object Detection | | - | Code | - | SAM + Moving Object Detection. |

| Title | Presentation | Project page | Code base | Affiliation | Description |
|:---|:---|:---|:---|:---|:---|
| OCR-SAM | | Blog | Code | - | Optical Character Recognition with SAM. |

| Title | Presentation | Project page | Code base | Affiliation | Description |
|:---|:---|:---|:---|:---|:---|
| SAMJS | | demo | Code | - | A JS SDK for SAM that supports remote sensing data segmentation and vectorization. |

Some of the presentations in this repository are borrowed from the original authors, and we are very thankful for their contributions.
This project is released under the MIT license. Please see the LICENSE file for more information.