  • Language
    Python
  • License
    MIT License

Repository Details

Code for "Domain Adaptation for Semantic Segmentation with Maximum Squares Loss" in PyTorch.

Domain Adaptation for Semantic Segmentation with Maximum Squares Loss

By Minghao Chen, Hongyang Xue, Deng Cai.

Introduction

A PyTorch implementation of our ICCV 2019 paper "Domain Adaptation for Semantic Segmentation with Maximum Squares Loss". The segmentation model is based on Deeplabv2 with a ResNet-101 backbone. The "MaxSquare+IW+Multi" variant introduced in the paper achieves competitive results on three unsupervised domain adaptation (UDA) benchmarks: GTA5, SYNTHIA, and CrossCity. Moreover, our method achieves state-of-the-art results in GTA5-to-Cityscapes and Cityscapes-to-CrossCity adaptation.
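For orientation, the maximum squares loss defined in the paper is the negative sum of squared class probabilities, averaged over pixels and halved. The framework-free sketch below illustrates that formula on plain Python lists. The names `softmax` and `max_squares_loss` are illustrative assumptions, not the repository's implementation, which operates on PyTorch tensors.

```python
import math


def softmax(logits):
    """Convert one pixel's class logits into probabilities."""
    m = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def max_squares_loss(pixel_logits):
    """Maximum squares loss over a list of per-pixel logit vectors.

    Computes -(1 / (2N)) * sum_n sum_c p[n][c] ** 2, where N is the
    number of pixels and p[n][c] is the predicted probability of class c.
    (Hypothetical helper for illustration only.)
    """
    n_pixels = len(pixel_logits)
    total = 0.0
    for logits in pixel_logits:
        probs = softmax(logits)
        total += sum(p * p for p in probs)
    return -total / (2 * n_pixels)


# A uniform two-class prediction gives -0.25; a confident one approaches -0.5.
uncertain = max_squares_loss([[0.0, 0.0]])
confident = max_squares_loss([[20.0, 0.0]])
print(uncertain, confident)
```

Minimizing this quantity pushes target-domain predictions toward confident (low-entropy) outputs, which is the role the loss plays on unlabeled target images.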

Citation

If you use this code in your research, please cite:

@InProceedings{Chen_2019_ICCV,
author = {Chen, Minghao and Xue, Hongyang and Cai, Deng},
title = {Domain Adaptation for Semantic Segmentation With Maximum Squares Loss},
booktitle = {The IEEE International Conference on Computer Vision (ICCV)},
month = {October},
year = {2019}
}

Requirements

The code is implemented with Python (3.6) and PyTorch (1.0.0).

Install the newest PyTorch from https://pytorch.org/.

To install the required Python packages, run

pip install -r requirements.txt

Setup

GTA5-to-Cityscapes:

  • Download the GTA5 dataset, which contains 24,966 annotated images with 1914×1052 resolution taken from the GTA5 game. We use the sample code for reading the label maps and the split into training/validation/test sets from here. In the experiments, we resize GTA5 images to 1280x720.
  • Download Cityscapes, which contains 5,000 annotated images with 2048×1024 resolution taken from real urban street scenes. We resize Cityscapes images to 1024x512 (or 1280x640, which yields slightly better results but costs more time).
  • Download the checkpoint pretrained on GTA5.
  • If you want to pretrain the model by yourself, download the model pretrained on ImageNet.

SYNTHIA-to-Cityscapes:

Cityscapes-to-CrossCity

  • Download the NTHU dataset, which consists of images with 2048×1024 resolution from four different cities: Rio, Rome, Tokyo, and Taipei. We resize images to 1024x512, the same as Cityscapes.
  • Download the checkpoint pretrained on Cityscapes.

Put all datasets into "datasets" folder and all checkpoints into "pretrained_model" folder.
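Before training, it can help to confirm that the two top-level folders are in place. The sketch below only checks for the "datasets" and "pretrained_model" folders named above. The helper name `missing_folders` is an assumption for illustration, and nothing is assumed about the sub-folder layout inside each directory.

```python
from pathlib import Path


def missing_folders(root, names=("datasets", "pretrained_model")):
    """Return the expected folder names that do not exist under root.

    Hypothetical helper: it checks only the two top-level folders
    mentioned in the setup notes, not their contents.
    """
    root = Path(root)
    return [name for name in names if not (root / name).is_dir()]


if __name__ == "__main__":
    missing = missing_folders(".")
    if missing:
        print("Missing folders:", ", ".join(missing))
    else:
        print("Folder layout looks complete.")
```

Run it from the repository root so the relative paths resolve the same way as in the training commands.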

Results

We present several transfer results reported in our paper and provide the corresponding checkpoints.

GTA5-to-Cityscapes:

Method     Source  MinEnt  MaxSquare  MaxSquare+IW  MaxSquare+IW+Multi
mIoU (%)   36.9    42.2    44.3       45.2          46.4

Cityscapes-to-CrossCity

City (mIoU, %)  Source  MaxSquare  MaxSquare+IW
Rome            51.0    53.9       54.5
Rio             48.9    52.0       53.3
Tokyo           47.8    49.7       50.5
Taipei          46.3    49.8       50.6

Training

GTA5-to-Cityscapes:

(Optional) Pretrain the model on the source domain (GTA5).

Otherwise, download the checkpoint pretrained on GTA5 listed in the "Setup" section.

python3 tools/train_source.py --gpu "0" --dataset 'gta5' --checkpoint_dir "./log/gta5_pretrain/" --iter_max 200000 --iter_stop 80000 --freeze_bn False --weight_decay 5e-4 --lr 2.5e-4 --crop_size "1280,720"

Then, in the next step, set --pretrained_ckpt_file "./log/gta5_pretrain/gta5final.pth".

  • MaxSquare
python3 tools/solve_gta5.py --gpu "0" --backbone "deeplabv2_multi" --dataset 'cityscapes' --checkpoint_dir "./log/gta2city_AdaptSegNet_ST=0.1_maxsquare_round=5/" --pretrained_ckpt_file "./pretrained_model/GTA5_source.pth" --round_num 5 --target_mode "maxsquare" --freeze_bn False --weight_decay 5e-4 --lr 2.5e-4 --lambda_target 0.1
  • MaxSquare+IW
python3 tools/solve_gta5.py --gpu "0" --backbone "deeplabv2_multi" --dataset 'cityscapes' --checkpoint_dir "./log/gta2city_AdaptSegNet_ST=0.1_IW_maxsquare_round=5/" --pretrained_ckpt_file "./pretrained_model/GTA5_source.pth" --round_num 5 --target_mode "IW_maxsquare" --freeze_bn False --weight_decay 5e-4 --lr 2.5e-4 --lambda_target 0.1 --IW_ratio 0.2

Pretrain the multi-level model on the source domain (GTA5) by adding "--multi True".

python3 tools/train_source.py --gpu "0" --dataset 'gta5' --checkpoint_dir "./log/gta5_pretrain_multi/" --iter_max 200000 --iter_stop 80000 --freeze_bn False --weight_decay 5e-4 --lr 2.5e-4 --crop_size "1280,720" --multi True
  • MaxSquare+IW+Multi
python3 tools/solve_gta5.py --gpu "0" --backbone "deeplabv2_multi" --dataset 'cityscapes' --checkpoint_dir "./log/gta2city_AdaptSegNet_ST=0.09_IW_maxsquare_multi_round=5/" --pretrained_ckpt_file "./log/gta5_pretrain_multi/gta5best.pth" --round_num 5 --target_mode "IW_maxsquare" --freeze_bn False --weight_decay 5e-4 --lr 2.5e-4 --target_crop_size "1280,640" --lambda_target 0.09 --IW_ratio 0.2 --multi True --lambda_seg 0.1 --threshold 0.95

Eval:

python3 tools/evaluate.py --gpu "0" --dataset 'cityscapes' --checkpoint_dir "./log/eval_city" --pretrained_ckpt_file "./log/gta2city_AdaptSegNet_ST=0.1_maxsquare_round=5/gta52city_maxsquarebest.pth" --image_summary True --flip True

To have a look at predicted examples, run tensorboard as follows:

tensorboard --logdir=./log/eval_city  --port=6009
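The results in this README are reported as mIoU. As a reminder of how that metric is commonly computed from a confusion matrix, here is a minimal sketch. It is not taken from tools/evaluate.py. The function name `mean_iou` and the choice to skip classes absent from both ground truth and prediction are assumptions.

```python
def mean_iou(confusion):
    """Mean IoU from a square confusion matrix.

    confusion[i][j] counts pixels with ground-truth class i predicted as j.
    Classes that appear in neither ground truth nor prediction are skipped
    (an assumption; other conventions exist).
    """
    num_classes = len(confusion)
    ious = []
    for c in range(num_classes):
        tp = confusion[c][c]
        fn = sum(confusion[c]) - tp  # class c pixels predicted as another class
        fp = sum(confusion[r][c] for r in range(num_classes)) - tp
        denom = tp + fp + fn
        if denom > 0:
            ious.append(tp / denom)
    return sum(ious) / len(ious) if ious else 0.0


# Two classes: class 0 IoU = 3 / (3 + 1 + 1), class 1 IoU = 5 / (5 + 1 + 1).
example = [[3, 1],
           [1, 5]]
print(round(100 * mean_iou(example), 1))
```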

Cityscapes-to-CrossCity

(Optional) Pretrain the model on the source domain (Cityscapes).

python3 tools/train_source.py --gpu "0" --dataset 'cityscapes' --checkpoint_dir "./log/cityscapes_pretrain_class13/" --iter_max 200000 --iter_stop 80000 --freeze_bn False --weight_decay 5e-4 --lr 2.5e-4 --crop_size "1024,512" --num_classes 13
  • MaxSquare (take "Rome" for example)
python3 tools/solve_crosscity.py --gpu "0" --city_name 'Rome' --source_dataset 'cityscapes' --checkpoint_dir "./log/city2Rome_maxsquare/" --pretrained_ckpt_file "./pretrained_model/Cityscapes_source_class13.pth"  --crop_size "1024,512" --target_crop_size "1024,512"  --epoch_num 10 --target_mode "maxsquare" --lr 2.5e-4 --lambda_target 0.1 --num_classes 13
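The Rome command can be repeated for the other three cities. The sketch below builds one command per city using only the flags shown above. The helper `crosscity_command`, the city-name spellings passed to --city_name, and the per-city checkpoint directory names (extrapolated from "./log/city2Rome_maxsquare/") are assumptions.

```python
import shlex

# The four target cities named in the Setup section.
CITIES = ["Rome", "Rio", "Tokyo", "Taipei"]


def crosscity_command(city):
    """Build the MaxSquare adaptation command for one target city."""
    return [
        "python3", "tools/solve_crosscity.py",
        "--gpu", "0",
        "--city_name", city,
        "--source_dataset", "cityscapes",
        # Assumed naming pattern, extrapolated from the Rome example.
        "--checkpoint_dir", f"./log/city2{city}_maxsquare/",
        "--pretrained_ckpt_file", "./pretrained_model/Cityscapes_source_class13.pth",
        "--crop_size", "1024,512",
        "--target_crop_size", "1024,512",
        "--epoch_num", "10",
        "--target_mode", "maxsquare",
        "--lr", "2.5e-4",
        "--lambda_target", "0.1",
        "--num_classes", "13",
    ]


for city in CITIES:
    # Print each command; pass the list to subprocess.run to execute it.
    print(" ".join(shlex.quote(arg) for arg in crosscity_command(city)))
```

Printing first makes it easy to inspect the arguments before launching a long training run.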

Acknowledgment

The structure of this code is largely based on this repo.

The Deeplabv2 model is borrowed from Pytorch-Deeplab.
