  • Stars: 263
  • Rank 154,804 (Top 4 %)
  • Language
    Python
  • Created over 1 year ago
  • Updated about 2 months ago

Repository Details

The official repo for [NeurIPS'23] "SAMRS: Scaling-up Remote Sensing Segmentation Dataset with Segment Anything Model"

SAMRS: Scaling-up Remote Sensing Segmentation Dataset with Segment Anything Model

Di Wang, Jing Zhang, Bo Du, Minqiang Xu, Lin Liu, Dacheng Tao, Liangpei Zhang

News | Abstract | Usage | Results | Statement

News

2024.03.25

  • The rotated bounding box version of SOTA (SOTA-RBB) can be obtained from [Dataset] and [Baidu]

2023.12.07

  • The SAMRS dataset can be acquired from [Baidu]

2023.09.30

  • The instance and detection labels are released! See [Dataset]

2023.09.26

  • The NeurIPS version is posted on arXiv!

2023.09.23

  • The code for generating the SAMRS dataset is released!

2023.09.22

  • The paper is accepted by NeurIPS 2023 Datasets and Benchmarks Track!

2023.08.30

  • The SAMRS images are released! See [Dataset]

2023.06.14

  • The semantic labels are released! See [Dataset]

2023.05.04

  • The tech report is posted on arXiv! Work in progress.

Other applications of ViTAE include: VSA | ViTPose | Matting | Scene Text Spotting | Video Object Segmentation

Introduction

This is the official repository of the paper "SAMRS: Scaling-up Remote Sensing Segmentation Dataset with Segment Anything Model".

Figure 1: Some examples of SAM segmentation results on remote sensing images.

In this study, we leverage SAM and existing RS object detection datasets to develop an efficient pipeline for generating a large-scale RS segmentation dataset, dubbed SAMRS. SAMRS surpasses existing high-resolution RS segmentation datasets in size by several orders of magnitude, and provides object category, location, and instance information that can be used for semantic segmentation, instance segmentation, and object detection, either individually or in combination. We also provide a comprehensive analysis of SAMRS from various aspects. We hope it could facilitate research in RS segmentation, particularly in large model pre-training.
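To illustrate how one set of per-object masks can serve semantic segmentation, instance segmentation, and object detection at once, here is a minimal sketch. It is not the repository's code: the function name `build_labels` and the input format (a class id plus a set of `(row, col)` pixels standing in for a SAM-predicted mask) are assumptions made for this example.

```python
from typing import List, Set, Tuple

Pixel = Tuple[int, int]  # (row, col)


def build_labels(height: int, width: int,
                 objects: List[Tuple[int, Set[Pixel]]]):
    """Turn per-object masks into three kinds of labels (hypothetical helper).

    objects: list of (class_id, pixels), with class_id > 0.
    Returns (semantic_map, instance_map, boxes); 0 means background in both
    maps, and each box is (class_id, xmin, ymin, xmax, ymax).
    """
    semantic = [[0] * width for _ in range(height)]
    instance = [[0] * width for _ in range(height)]
    boxes = []
    next_id = 1
    for class_id, pixels in objects:
        if not pixels:
            continue  # skip empty masks
        for r, c in pixels:
            # Where masks overlap, the later object overwrites the earlier one.
            semantic[r][c] = class_id
            instance[r][c] = next_id
        rows = [r for r, _ in pixels]
        cols = [c for _, c in pixels]
        boxes.append((class_id, min(cols), min(rows), max(cols), max(rows)))
        next_id += 1
    return semantic, instance, boxes
```

The semantic map can be used on its own for semantic segmentation, the instance map adds per-object identity, and the boxes recover the detection-style location of each object.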

Usage

Results

The basic information of generated datasets

Figure 2: Comparisons of different high-resolution RS segmentation datasets.

We compare our SAMRS dataset with existing high-resolution RS segmentation datasets in the table. Based on the available high-resolution RSI object detection datasets, we can efficiently annotate 105,090 images, which is more than ten times the capacity of existing datasets. Additionally, SAMRS inherits the categories of the original detection datasets, which makes it more diverse than other high-resolution RS segmentation collections. It is worth noting that RS object detection datasets usually have more diverse categories than RS segmentation datasets due to the difficulty of labeling pixels in RSIs, and thus SAMRS reduces this gap.

Visualization of Generated Masks

Figure 3: Some visual examples from the three subsets of our SAMRS dataset.

In the figure, we visualize some segmentation annotations from the three subsets of our SAMRS dataset. As can be seen, SOTA exhibits a greater number of instances for tiny cars, whereas FAST provides a more fine-grained annotation of existing categories in SOTA such as car, ship, and plane. SIOR, on the other hand, offers annotations for more diverse ground objects, such as dam. Hence, our SAMRS dataset encompasses a wide range of categories with varying sizes and distributions, thereby presenting a new challenge for RS semantic segmentation.

Dataset Statistics and Analysis

The class distribution.

Figure 4: Statistics of the number of pixels and instances for each category in the SAMRS database. The histograms for the subsets SOTA, SIOR, and FAST are shown in the first, second, and third columns, respectively. The first row presents histograms on a per-pixel basis, while the second row presents histograms on a per-instance basis.
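The per-pixel and per-instance counts behind such histograms can be sketched as follows. This is an illustrative example only: it assumes labels stored as a semantic map and an instance map (nested lists with 0 as background), and the function name `class_distribution` is hypothetical.

```python
from collections import Counter


def class_distribution(semantic, instance):
    """Count pixels and instances per category (hypothetical helper).

    semantic[r][c] holds the class id and instance[r][c] the instance id;
    0 means background in both maps.
    """
    pixel_counts = Counter()
    instance_class = {}  # instance id -> class id (first one seen)
    for sem_row, inst_row in zip(semantic, instance):
        for cls, iid in zip(sem_row, inst_row):
            if cls == 0:
                continue
            pixel_counts[cls] += 1
            if iid != 0:
                instance_class.setdefault(iid, cls)
    instance_counts = Counter(instance_class.values())
    return pixel_counts, instance_counts
```

A category with a few very large objects scores high on the per-pixel count but low on the per-instance count, which is why the two rows of the figure can look quite different.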

The mask size distribution.

Figure 5: Statistics of the mask sizes in different subsets of the SAMRS database. (a) SOTA. (b) SIOR. (c) FAST.
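A mask size distribution of this kind can be computed by counting the pixels of each instance and binning the resulting areas. The sketch below assumes an instance map stored as nested lists with 0 as background; the function names and the bin edges are illustrative choices, not values taken from the paper.

```python
from bisect import bisect_right
from collections import Counter


def mask_sizes(instance):
    """Return {instance_id: area in pixels}, ignoring background (id 0)."""
    return Counter(iid for row in instance for iid in row if iid != 0)


def size_histogram(areas, edges):
    """Bin areas into half-open bins [edges[i], edges[i+1]).

    Areas below edges[0] or at/above edges[-1] are dropped.
    """
    hist = [0] * (len(edges) - 1)
    for area in areas:
        i = bisect_right(edges, area) - 1
        if 0 <= i < len(hist):
            hist[i] += 1
    return hist
```

For example, `size_histogram(mask_sizes(instance_map).values(), [1, 2, 4, 8])` counts instances with areas in [1, 2), [2, 4), and [4, 8) pixels.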

Statement

This project is for research purposes only. For any other questions please contact [email protected].

Citation

If you find SAMRS helpful, please consider giving this repo a ⭐ and citing:

@inproceedings{SAMRS,
  title={{SAMRS}: Scaling-up Remote Sensing Segmentation Dataset with Segment Anything Model},
  author={Di Wang and Jing Zhang and Bo Du and Minqiang Xu and Lin Liu and Dacheng Tao and Liangpei Zhang},
  booktitle={Thirty-seventh Conference on Neural Information Processing Systems Datasets and Benchmarks Track},
  year={2023},
  url={https://openreview.net/forum?id=jHrgq55ftl}
}

Relevant Projects

[1] An Empirical Study of Remote Sensing Pretraining, IEEE TGRS, 2022 | Paper | Github
     Di Wang∗, Jing Zhang∗, Bo Du, Gui-Song Xia and Dacheng Tao

[2] Advancing Plain Vision Transformer Towards Remote Sensing Foundation Model, IEEE TGRS, 2022 | Paper | Github
     Di Wang∗, Qiming Zhang∗, Yufei Xu∗, Jing Zhang, Bo Du, Dacheng Tao and Liangpei Zhang

More Repositories

1

ViTPose

The official repo for [NeurIPS'22] "ViTPose: Simple Vision Transformer Baselines for Human Pose Estimation" and [TPAMI'23] "ViTPose++: Vision Transformer for Generic Body Pose Estimation"
Python
1,313
star
2

ViTDet

Unofficial implementation for [ECCV'22] "Exploring Plain Vision Transformer Backbones for Object Detection"
Python
524
star
3

ViTAE-Transformer-Remote-Sensing

A comprehensive list [SAMRS@NeurIPS'23, RVSA@TGRS'22, RSP@TGRS'22] of our research works related to remote sensing, including papers, codes, and citations. Note: The repo for [TGRS'22] "An Empirical Study of Remote Sensing Pretraining" has been moved to: https://github.com/ViTAE-Transformer/RSP
TeX
446
star
4

Remote-Sensing-RVSA

The official repo for [TGRS'22] "Advancing Plain Vision Transformer Towards Remote Sensing Foundation Model"
Python
403
star
5

ViTAE-Transformer

The official repo for [NeurIPS'21] "ViTAE: Vision Transformer Advanced by Exploring Intrinsic Inductive Bias" and [IJCV'22] "ViTAEv2: Vision Transformer Advanced by Exploring Inductive Bias for Image Recognition and Beyond"
Python
249
star
6

ViTAE-Transformer-Matting

A comprehensive list [AIM@IJCAI'21, P3M@MM'21, GFM@IJCV'22, RIM@CVPR'23, P3MNet@IJCV'23] of our research works related to image matting, including papers, codes, datasets, demos, and citations. Note: The repo for [IJCV'23] "Rethinking Portrait Matting with Privacy Preserving" has been moved to: https://github.com/ViTAE-Transformer/P3M-Net
TeX
229
star
7

QFormer

The official repo for [TPAMI'23] "Vision Transformer with Quadrangle Attention"
Python
158
star
8

ViTAE-VSA

The official repo for [ECCV'22] "VSA: Learning Varied-Size Window Attention in Vision Transformers"
Python
152
star
9

MTP

The official repo for [JSTARS'24] "MTP: Advancing Remote Sensing Foundation Model via Multi-Task Pretraining"
Python
140
star
10

RSP

The official repo for [TGRS'22] "An Empirical Study of Remote Sensing Pretraining"
Python
130
star
11

P3M-Net

The official repo for [IJCV'23] "Rethinking Portrait Matting with Privacy Preserving"
Python
90
star
12

DeepSolo

[CVPR 2023] DeepSolo: Let Transformer Decoder with Explicit Points Solo for Text Spotting
Python
68
star
13

ViTAE-Transformer-Scene-Text-Detection

The official repo for [IJCV'22] I3CL: Intra- and Inter-Instance Collaborative Learning for Arbitrary-shaped Scene Text Detection
Python
37
star
14

LeMeViT

The official repo for [IJCAI'24] "LeMeViT: Efficient Vision Transformer with Learnable Meta Tokens for Remote Sensing Image Interpretation"
Python
37
star
15

SimDistill

The official repo for [AAAI 2024] "SimDistill: Simulated Multi-modal Distillation for BEV 3D Object Detection"
Python
22
star
16

VOS-LLB

The official repo for [AAAI'23] "Learning to Learn Better for Video Object Segmentation"
Python
10
star
17

APTv2

The official repo for the extension of [NeurIPS'22] "APT-36K: A Large-scale Benchmark for Animal Pose Estimation and Tracking": https://github.com/pandorgan/APT-36K
Python
9
star
18

I3CL

The official repo for [IJCV'22] "I3CL: Intra- and Inter-Instance Collaborative Learning for Arbitrary-shaped Scene Text Detection"
Python
2
star