  • Language: Python
  • License: MIT License

Repository Details

Pyramid Grafting Network for One-Stage High Resolution Saliency Detection. CVPR 2022

PGNet

Pyramid Grafting Network for One-Stage High Resolution Saliency Detection, CVPR 2022 (arXiv 2204.05041)

Abstract

Recent salient object detection (SOD) methods based on deep neural networks have achieved remarkable performance. However, most existing SOD models designed for low-resolution input perform poorly on high-resolution images due to the contradiction between the sampling depth and the receptive field size. To resolve this contradiction, we propose a novel one-stage framework called Pyramid Grafting Network (PGNet), which uses transformer and CNN backbones to extract features from images of different resolutions independently and then grafts the features from the transformer branch onto the CNN branch. An attention-based Cross-Model Grafting Module (CMGM) is proposed to enable the CNN branch to combine broken detailed information more holistically, guided by features from different sources during decoding. Moreover, we design an Attention Guided Loss (AGL) to explicitly supervise the attention matrix generated by CMGM, helping the network better interact with the attention from different models. We contribute a new Ultra-High-Resolution Saliency Detection dataset, UHRSD, containing 5,920 images at 4K-8K resolutions. To our knowledge, it is the largest dataset in both quantity and resolution for the high-resolution SOD task, and it can be used for training and testing in future research. Extensive experiments on UHRSD and widely used SOD datasets demonstrate that our method achieves superior performance compared to state-of-the-art methods.

Ultra High-Resolution Saliency Detection Dataset

Visual display of samples in the UHRSD dataset. Best viewed by clicking and zooming in.

To relieve the lack of high-resolution datasets for SOD, we contribute the Ultra High-Resolution Saliency Detection (UHRSD) dataset with a total of 5,920 images in 4K (3840 × 2160) or higher resolution, including 4,932 images for training and 988 images for testing. The images were manually selected from websites (e.g., Flickr, Pixabay) with free copyright. Our dataset is diverse in terms of image scenes, with a balance of complex and simple salient objects of various sizes.

To our knowledge, it is the largest dataset in both quantity and resolution for the high-resolution SOD task, and it can be used for training and testing in future research.

  • Our UHRSD (Ultra High-Resolution Saliency Detection) Dataset:

We provide the original 4K version and the convenient 2K version of our UHRSD (Ultra High-Resolution Saliency Detection) Dataset for download: Google Drive

Usage

Requirements

  • Python 3.8
  • PyTorch 1.7.1
  • OpenCV
  • NumPy
  • Apex
  • timm
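
Before training, it can help to confirm that the packages listed above are importable. The following is a minimal sketch and not part of the PGNet repository: the helper name `check_requirements` is hypothetical, and the import names (`torch`, `cv2`, `numpy`, `apex`, `timm`) are assumptions based on how these packages are commonly imported. It does not check versions.

```python
import importlib.util
import sys

# Import names assumed for the listed requirements (hypothetical mapping,
# not taken from the PGNet repository).
REQUIRED_MODULES = {
    "PyTorch": "torch",
    "OpenCV": "cv2",
    "NumPy": "numpy",
    "Apex": "apex",
    "timm": "timm",
}


def check_requirements(modules=REQUIRED_MODULES):
    """Return {label: bool} telling whether each module can be located."""
    status = {}
    for label, module_name in modules.items():
        # find_spec returns None when a top-level module cannot be found.
        status[label] = importlib.util.find_spec(module_name) is not None
    return status


if __name__ == "__main__":
    print(f"Python {sys.version_info.major}.{sys.version_info.minor}")
    for label, available in check_requirements().items():
        print(f"{label}: {'found' if available else 'MISSING'}")
```

Any entry reported as MISSING would need to be installed before running the training or test scripts.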

Directory

The directory layout should look like this:

-- src 
-- model (saved model)
-- pre (pretrained model)
-- result (saliency maps)
-- data (train dataset and test dataset)
   |-- DUTS-TR+HR
   |   |-- image
   |   |-- mask
   |-- UHRSOD+HRSOD
   |   |-- image
   |   |-- mask
   ...
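
The layout above can be created ahead of time with a short script. This is a minimal sketch and not part of the repository: the function name `make_layout` is hypothetical, and the folder names are copied from the tree shown above. Datasets elided by the `...` in the tree are not guessed here.

```python
from pathlib import Path

# Folder names copied from the tree above; entries hidden behind "..."
# in the tree are intentionally left out.
TOP_LEVEL = ["src", "model", "pre", "result", "data"]
DATASETS = ["DUTS-TR+HR", "UHRSOD+HRSOD"]


def make_layout(root):
    """Create the expected folders under `root` and return them as Path objects."""
    root = Path(root)
    folders = [root / name for name in TOP_LEVEL]
    # Each dataset folder holds an image/ and a mask/ subfolder.
    folders += [
        root / "data" / dataset / sub
        for dataset in DATASETS
        for sub in ("image", "mask")
    ]
    for folder in folders:
        # exist_ok=True makes the call safe to repeat on an existing checkout.
        folder.mkdir(parents=True, exist_ok=True)
    return folders
```

Calling `make_layout(".")` from the project root would create any missing folders and leave existing ones untouched.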
   

Train

cd src
./train.sh
  • We implement our method in PyTorch and conduct experiments on 2 NVIDIA 2080Ti GPUs.
  • We adopt pre-trained ResNet-18 and Swin-B-224 as backbone networks, which are saved in the pre folder.
  • We train our method under 3 settings: DUTS-TR, DUTS-TR+HRSOD and UHRSD_TR+HRSOD_TR.
  • After training, the trained models will be saved in the model folder.

Test

The trained model can be downloaded here: Google Drive

cd src
python test.py
  • After testing, saliency maps will be saved in the result folder.

Saliency Map

Trained on DUTS-TR: Google Drive

Trained on DUTS-TR+HRSOD: Google Drive

Trained on UHRSD+HRSOD: Google Drive

Citation

@inproceedings{xie2022pyramid,
    author    = {Xie, Chenxi and Xia, Changqun and Ma, Mingcan and Zhao, Zhirui and Chen, Xiaowu and Li, Jia},
    title     = {Pyramid Grafting Network for One-Stage High Resolution Saliency Detection},
    booktitle = {CVPR},
    year      = {2022}
}
