Code for our ACM MM 2020 paper "Context-aware Feature Generation for Zero-shot Semantic Segmentation".

CaGNet: Context-aware Feature Generation for Zero-shot Semantic Segmentation

Created by Zhangxuan Gu, Siyuan Zhou, Li Niu*, Zihan Zhao, Liqing Zhang*.

Paper Link: [arXiv]

News

In our journal extension CaGNetv2 [arXiv, github], we extend pixel-wise feature generation and finetuning to patch-wise feature generation and finetuning.

Visualization on Pascal-VOC

Citation

If you find our work useful in your research, please consider citing:

@inproceedings{Gu2020CaGNet,
  title={Context-aware Feature Generation for Zero-shot Semantic Segmentation},
  author={Zhangxuan Gu and Siyuan Zhou and Li Niu and Zihan Zhao and Liqing Zhang},
  booktitle={ACM International Conference on Multimedia},
  year={2020}
}

Introduction

Existing semantic segmentation models rely heavily on dense pixel-wise annotations. To reduce the annotation burden, we focus on a challenging task named zero-shot semantic segmentation, which aims to segment unseen objects with zero annotations. This can be achieved by transferring knowledge across categories via semantic word embeddings. In this paper, we propose a novel context-aware feature generation method for zero-shot segmentation named CaGNet. In particular, observing that a pixel-wise feature depends heavily on its contextual information, we insert a contextual module into a segmentation network to capture the pixel-wise contextual information, which guides the process of generating more diverse and context-aware features from semantic word embeddings. Our method achieves state-of-the-art results on three benchmark datasets for zero-shot segmentation.

Overview of Our CaGNet

Experiments

Basic Settings

  • Inductive or Transductive:

    Inductive -> No test samples (images and annotations) are available during training (including finetuning).

  • Generalized or Non-generalized:

    Generalized -> Both seen and unseen categories can appear in test samples.

  • Baselines:

    SPNet [github, paper] & ZS3Net [github, paper]

  • Backbone Network:

    DeepLabV2 (ResNet-101) pre-trained on ImageNet (following SPNet)

  • Semantic Word Embedding:

    Word2vec (300-dim) & FastText (300-dim)

  • Datasets:

    • Pascal-Context

      Samples: 4998 train / 5105 test

      Split: 33 classes including 29 seen / 4 unseen (cow, motorbike, sofa, cat)

    • COCO-Stuff

      Samples: 118288 train / 5001 test

      Split: 182 classes including 167 seen / 15 unseen (following SPNet)

    • Pascal-VOC and SBD (Semantic Boundary Dataset)

      Samples: 11685 train / 1449 test

      Split: 20 classes including 15 seen / 5 unseen (following SPNet)

  • "Background" or Not:

    ZS3Net uses the word embedding of "background" as the semantic representation of all categories (e.g., sky and ground) belonging to "background", which seems a little unreasonable, while SPNet ignores "background" in both training and testing. Although including "background" can bring a large performance gain, we follow SPNet and ignore it throughout.

  • Additional Operation on Train Samples:

    Since train images may contain pixels that do not belong to seen categories (e.g. unseen categories, background, or no label), we mark the annotations of these pixels as 'ignored' so that only seen categories are visible during training (including finetuning).
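The masking step described above can be sketched as follows. This is a minimal illustration rather than the repository's actual preprocessing code: the function name `mask_unseen_pixels`, the ignore value 255, and the toy class IDs are all assumptions made for the example.

```python
import numpy as np

# Assumption: 255 is used as the "ignored" label value. This is a common
# convention in segmentation code, but it is not confirmed by this README.
IGNORE_INDEX = 255


def mask_unseen_pixels(annotation, seen_class_ids, ignore_index=IGNORE_INDEX):
    """Return a copy of `annotation` in which every pixel whose label is not
    a seen class is replaced by `ignore_index`."""
    annotation = np.asarray(annotation)
    keep = np.isin(annotation, list(seen_class_ids))
    return np.where(keep, annotation, ignore_index).astype(annotation.dtype)


# Toy 2x3 annotation (hypothetical IDs): classes 0 and 1 are seen,
# class 7 is unseen, and 255 is already unlabeled.
toy = np.array([[0, 1, 7], [7, 255, 1]], dtype=np.uint8)
masked = mask_unseen_pixels(toy, seen_class_ids={0, 1})
print(masked)
```

A loss function that skips the ignore value then never receives supervision from unseen, background, or unlabeled pixels.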

Results

“ST” in the following tables stands for the self-training strategy described in ZS3Net.

Our Results on Pascal-Context dataset

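| Method    | hIoU   | mIoU   | pixel acc. | mean acc. | S-mIoU | U-mIoU |
|-----------|--------|--------|------------|-----------|--------|--------|
| SPNet     | 0      | 0.2938 | 0.5793     | 0.4486    | 0.3357 | 0      |
| SPNet-c   | 0.0718 | 0.3079 | 0.5790     | 0.4488    | 0.3514 | 0.0400 |
| ZS3Net    | 0.1246 | 0.3010 | 0.5710     | 0.4442    | 0.3304 | 0.0768 |
| CaGNet    | 0.2061 | 0.3347 | 0.5924     | 0.4900    | 0.3610 | 0.1442 |
| ZS3Net+ST | 0.1488 | 0.3102 | 0.5725     | 0.4532    | 0.3398 | 0.0953 |
| CaGNet+ST | 0.2252 | 0.3352 | 0.5961     | 0.4962    | 0.3644 | 0.1630 |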

Our Results on COCO-Stuff dataset

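| Method    | hIoU   | mIoU   | pixel acc. | mean acc. | S-mIoU | U-mIoU |
|-----------|--------|--------|------------|-----------|--------|--------|
| SPNet     | 0.0140 | 0.3164 | 0.5132     | 0.4593    | 0.3461 | 0.0070 |
| SPNet-c   | 0.1398 | 0.3278 | 0.5341     | 0.4363    | 0.3518 | 0.0873 |
| ZS3Net    | 0.1495 | 0.3328 | 0.5467     | 0.4837    | 0.3466 | 0.0953 |
| CaGNet    | 0.1819 | 0.3345 | 0.5658     | 0.4845    | 0.3549 | 0.1223 |
| ZS3Net+ST | 0.1620 | 0.3367 | 0.5631     | 0.4862    | 0.3489 | 0.1055 |
| CaGNet+ST | 0.1946 | 0.3372 | 0.5676     | 0.4854    | 0.3555 | 0.1340 |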

Our Results on Pascal-VOC dataset

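| Method    | hIoU   | mIoU   | pixel acc. | mean acc. | S-mIoU | U-mIoU |
|-----------|--------|--------|------------|-----------|--------|--------|
| SPNet     | 0.0002 | 0.5687 | 0.7685     | 0.7093    | 0.7583 | 0.0001 |
| SPNet-c   | 0.2610 | 0.6315 | 0.7755     | 0.7188    | 0.7800 | 0.1563 |
| ZS3Net    | 0.2874 | 0.6164 | 0.7941     | 0.7349    | 0.7730 | 0.1765 |
| CaGNet    | 0.3972 | 0.6545 | 0.8068     | 0.7636    | 0.7840 | 0.2659 |
| ZS3Net+ST | 0.3328 | 0.6302 | 0.8095     | 0.7382    | 0.7802 | 0.2115 |
| CaGNet+ST | 0.4366 | 0.6577 | 0.8164     | 0.7560    | 0.7859 | 0.3031 |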

Please note that our reproduced results of SPNet on the Pascal-VOC dataset were obtained using their released model and code with careful tuning, but are still lower than their reported results.
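The tables do not define hIoU. The sketch below assumes it is the harmonic mean of S-mIoU (seen classes) and U-mIoU (unseen classes), which is the usual definition in generalized zero-shot segmentation and agrees with the reported CaGNet rows up to rounding. The helper name `harmonic_iou` is introduced here for illustration only.

```python
def harmonic_iou(seen_miou, unseen_miou):
    """Harmonic mean of seen-class and unseen-class mIoU (0.0 if both are 0)."""
    total = seen_miou + unseen_miou
    if total == 0:
        return 0.0
    return 2.0 * seen_miou * unseen_miou / total


# (S-mIoU, U-mIoU, reported hIoU) for the CaGNet rows in the three result tables.
cagnet_rows = {
    "Pascal-Context": (0.3610, 0.1442, 0.2061),
    "COCO-Stuff": (0.3549, 0.1223, 0.1819),
    "Pascal-VOC": (0.7840, 0.2659, 0.3972),
}

for name, (seen, unseen, reported) in cagnet_rows.items():
    computed = harmonic_iou(seen, unseen)
    print(f"{name}: computed {computed:.4f}, reported {reported:.4f}")
```

Because the harmonic mean is dominated by the smaller term, a model that scores near zero on unseen classes gets an hIoU near zero regardless of its seen-class accuracy, as the SPNet rows show.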

Hardware Dependency

Our released code currently supports a single GPU or multiple GPUs. To obtain satisfactory training results, we advise that each GPU have at least 32GB of memory and that the batch size be 8 or larger.

The results in the conference paper / this repository were obtained on a single 32GB GPU with batch size 8. If you use multiple GPUs (each ≥ 32GB) to train CaGNet, you may achieve better results.

Getting Started

Installation

1. Clone this repository.

git clone https://github.com/bcmi/CaGNet-Zero-Shot-Semantic-Segmentation.git

2. Create a Python environment for CaGNet via conda.

conda env create -f CaGNet_environment.yaml

3. Download the datasets.

  • Pascal-VOC

    --> CaGNet_VOC2012_data.tar : BCMI-Cloud or BaiduNetDisk (extraction code: beau)

    1. download the above .tar file into directory ./dataset/voc12/
    2. uncompress it to form ./dataset/voc12/images/ and ./dataset/voc12/annotations/
  • Pascal-Context

    --> CaGNet_context_data.tar : BCMI-Cloud or BaiduNetDisk (extraction code: rk29)

    1. download the above .tar file into directory ./dataset/context/
    2. uncompress it to form ./dataset/context/images/ and ./dataset/context/annotations/
  • COCO-Stuff

    1. follow the setup instructions on the COCO-Stuff homepage to obtain two folders: images and annotations.
    2. move the above two folders into directory ./dataset/cocostuff/ to form ./dataset/cocostuff/images/ and ./dataset/cocostuff/annotations/
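After following the dataset steps above, a quick check like the one below can confirm that the expected directories exist. It is a hypothetical helper, not part of the repository; it only assumes the `./dataset/<name>/images/` and `./dataset/<name>/annotations/` layout stated above.

```python
from pathlib import Path

# Directory names taken from the dataset steps above.
DATASETS = ("voc12", "context", "cocostuff")


def missing_dataset_dirs(root="./dataset", names=DATASETS):
    """Return the expected images/ and annotations/ directories that are absent."""
    missing = []
    for name in names:
        for sub in ("images", "annotations"):
            path = Path(root) / name / sub
            if not path.is_dir():
                missing.append(path)
    return missing


if __name__ == "__main__":
    # Run from the repository root; prints nothing when the layout is complete.
    for path in missing_dataset_dirs():
        print(f"missing: {path}")
```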

4. Download the pre-trained weights and our best models into directory ./trained_models/

  • DeepLabV2 pre-trained weights for Pascal-VOC and Pascal-Context

    --> deeplabv2_resnet101_init.pth : BCMI-Cloud or BaiduNetDisk (extraction code: 5o0m)

  • SPNet pre-trained weights for COCO-Stuff

    --> spnet_cocostuff_init.pth : BCMI-Cloud or BaiduNetDisk (extraction code: qjpo)

  • our best model on Pascal-VOC

    --> voc12_ourbest.pth : BCMI-Cloud or BaiduNetDisk (extraction code: nxj4)

  • our best model on Pascal-Context

    --> context_ourbest.pth : BCMI-Cloud or BaiduNetDisk (extraction code: 0x2i)

  • our best model on COCO-Stuff

    --> cocostuff_ourbest.pth : BCMI-Cloud or BaiduNetDisk (extraction code: xl88)

Training

1. Train on the Pascal-VOC dataset

python train.py --config ./configs/voc12.yaml --schedule step1
python train.py --config ./configs/voc12_finetune.yaml --schedule mixed

2. Train on the Pascal-Context dataset

python train.py --config ./configs/context.yaml --schedule step1
python train.py --config ./configs/context_finetune.yaml --schedule mixed

3. Train on the COCO-Stuff dataset

python train.py --config ./configs/cocostuff.yaml --schedule step1
python train.py --config ./configs/cocostuff_finetune.yaml --schedule mixed
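Each dataset above is trained with two commands: a `step1` schedule followed by a `mixed` schedule on the `_finetune` config. The sketch below is a hypothetical wrapper (not part of the repository) that builds both commands for a dataset name and, assuming the second stage is meant to run only after the first has finished, executes them in order. It uses only the flags shown above.

```python
import subprocess


def training_commands(dataset):
    """Build the two training commands (step1, then mixed finetuning)."""
    return [
        ["python", "train.py", "--config", f"./configs/{dataset}.yaml",
         "--schedule", "step1"],
        ["python", "train.py", "--config", f"./configs/{dataset}_finetune.yaml",
         "--schedule", "mixed"],
    ]


def run_training(dataset, dry_run=True):
    """Print both commands; when dry_run is False, also run them in order."""
    for cmd in training_commands(dataset):
        print(" ".join(cmd))
        if not dry_run:
            # check=True stops the pipeline if a stage exits with an error.
            subprocess.run(cmd, check=True)


# Dry run: only prints the commands for Pascal-VOC.
run_training("voc12")
```

Calling `run_training("context", dry_run=False)` from the repository root would launch both stages for Pascal-Context.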

Testing

1. Test our best model on the Pascal-VOC dataset

python train.py --config ./configs/voc12.yaml --init_model ./trained_models/voc12_ourbest.pth --val

2. Test our best model on the Pascal-Context dataset

python train.py --config ./configs/context.yaml --init_model ./trained_models/context_ourbest.pth --val

3. Test our best model on the COCO-Stuff dataset

python train.py --config ./configs/cocostuff.yaml --init_model ./trained_models/cocostuff_ourbest.pth --val
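The test commands above fail if the corresponding checkpoint has not been downloaded yet. As a small convenience, the hypothetical helper below (not part of the repository) checks for `<dataset>_ourbest.pth` in `./trained_models/` before returning the evaluation command; the file naming follows the download list in the Installation section.

```python
from pathlib import Path


def evaluation_command(dataset, model_dir="./trained_models"):
    """Return the evaluation command for `dataset`, or raise if the
    released checkpoint has not been downloaded into `model_dir`."""
    checkpoint_name = f"{dataset}_ourbest.pth"
    if not (Path(model_dir) / checkpoint_name).is_file():
        raise FileNotFoundError(
            f"{checkpoint_name} not found in {model_dir}; download it first"
        )
    return [
        "python", "train.py",
        "--config", f"./configs/{dataset}.yaml",
        "--init_model", f"{model_dir}/{checkpoint_name}",
        "--val",
    ]


if __name__ == "__main__":
    for name in ("voc12", "context", "cocostuff"):
        try:
            print(" ".join(evaluation_command(name)))
        except FileNotFoundError as err:
            print(f"skipping {name}: {err}")
```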

Visualization

Coming soon!

Try on Custom Data

Coming soon!

Acknowledgement

Some of the code is built upon FUNIT and SPNet. We thank the authors for their great work!

If you run into any problems or find any bugs, don't hesitate to comment on GitHub or make a pull request!

CaGNet is freely available for non-commercial use, and may be redistributed under these conditions. For commercial queries, please drop us an e-mail. We will send you the detailed agreement.
