Deep Image Matting: A Comprehensive Survey

This is the official repository of the paper Deep Image Matting: A Comprehensive Survey.

Jizhizi Li, Jing Zhang, and Dacheng Tao¹

¹ The University of Sydney, Sydney, Australia

Introduction

Image matting refers to extracting a precise alpha matte from natural images, and it plays a critical role in various downstream applications, such as image editing. The emergence of deep learning has revolutionized the field and given rise to multiple new techniques, including automatic, interactive, and referring image matting. Here we present a comprehensive review of recent advancements in image matting in the deep learning era, focusing on two fundamental sub-tasks: auxiliary input-based image matting and automatic image matting.

Preliminary

Image matting, which refers to the precise extraction of a soft matte for foreground objects in arbitrary images, has been extensively studied for several decades. The process can be described mathematically as I_i = α_i F_i + (1 − α_i) B_i, where i indexes a pixel, I represents the input image, F represents the foreground image, and B represents the background image. The opacity of the foreground at pixel i is denoted by α_i, which ranges from 0 to 1. The following figure shows a typical input image, its ground truth alpha matte, and various auxiliary inputs: trimap, background, coarse map, user clicks, scribbles, and a text description. For this image, the text description could be "the cute smiling brown dog in the middle of the image".
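The standard compositing model, I_i = α_i F_i + (1 − α_i) B_i, can be illustrated with a minimal pure-Python sketch. The function names `composite_pixel` and `composite_image` are hypothetical and are not part of this repository. Images are assumed to be nested lists of RGB tuples with values in [0, 1]; a practical implementation would more likely use an array library.

```python
from typing import List, Tuple

Pixel = Tuple[float, float, float]


def composite_pixel(alpha: float, fg: Pixel, bg: Pixel) -> Pixel:
    """Blend one RGB pixel: I = alpha * F + (1 - alpha) * B."""
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must lie in [0, 1]")
    r, g, b = (alpha * f + (1.0 - alpha) * k for f, k in zip(fg, bg))
    return (r, g, b)


def composite_image(
    alpha: List[List[float]],
    fg: List[List[Pixel]],
    bg: List[List[Pixel]],
) -> List[List[Pixel]]:
    """Apply the compositing equation to every pixel of same-sized images."""
    return [
        [composite_pixel(a, f, k) for a, f, k in zip(a_row, f_row, k_row)]
        for a_row, f_row, k_row in zip(alpha, fg, bg)
    ]


# Illustrative values only: a red foreground over a blue background.
red = (1.0, 0.0, 0.0)
blue = (0.0, 0.0, 1.0)
half = composite_pixel(0.5, red, blue)  # an even mix of the two colors
```

Matting is the inverse of this operation: given only I (and possibly an auxiliary input), estimate α, and in some methods F and B as well. That is why the problem is under-constrained for every pixel.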

Image Matting Methods

We compile a timeline of the developments in deep learning-based image matting methods as follows.

We also list a summary of image matting methods organized according to the year of publication, the publication venue, input modality, automaticity, matting target, architecture, and availability of the code (with the link). The list of papers is chronologically ordered. Please note that [U] stands for the unofficial implementation of the code.

Year	Method	Pub.	Input	Auto.	Target	Arch.	Code
2016	Deep automatic portrait matting (DAPM)	ECCV	RGB	✓	human	Sequential two-step CNN	-
2016	Natural image matting using deep convolutional neural networks (DCNN)	ECCV	RGB-Coarse	✗	object	One-stage CNN	-
2017	Deep image matting (DIM)	CVPR	RGB-Trimap	✗	object	One-stage CNN+Refine	Github[U]
2017	Fast deep matting for portrait animation on mobile phone (FDM)	MM	RGB	✓	human	Sequential two-step CNN	-
2018	Tom-Net: Learning transparent object matting from a single image (TOM-Net)	CVPR	RGB	✓	trans.	Sequential two-step CNN+Refine	Github
	Deep propagation based image matting (DMPN)	IJCAI	RGB-Trimap	✗	object	One-stage CNN	-
	Alphagan: Generative adversarial networks for natural image matting (AlphaGAN)	BMVC	RGB-Trimap	✗	object	One-stage GAN	Github[U]
	Semantic soft segmentation (SSS)	TOG	RGB	✓	object	Sequential two-stage	Github
	Semantic human matting (SHM)	MM	RGB	✓	human	Sequential two-step CNN	Github[U]
	Active matting (ActiveMatting)	NeurIPS	RGB-Click	✗	object	One-stage RNN	-
2019	A late fusion cnn for digital matting (LF)	CVPR	RGB	✓	object	Sequential two-stage CNN	Github
	Learning-based sampling for natural image matting (SampleNet)	CVPR	RGB-Trimap	✗	object	Parallel three-stream CNN	-
	Indices matter: Learning to index for deep image matting (IndexNet)	ICCV	RGB-Trimap	✗	object	One-stage CNN	Github
	Disentangled image matting (AdaMatting)	ICCV	RGB-Trimap	✗	object	Parallel two-stream CNN+refine	-
	Context-aware image matting for simultaneous foreground and alpha estimation (Context-Aware)	ICCV	RGB-Trimap	✗	object	Two-stream CNN	Github
2020	Natural image matting via guided contextual attention (GCA)	AAAI	RGB-Trimap	✗	object	One-stage CNN	Github
	Background matting: The world is your green screen (BM)	CVPR	RGB-Bg	✗	human	Parallel four-stream CNN	Github
	Hierarchical opacity propagation for image matting (HOP)	arXiv	RGB-Trimap	✗	object	Parallel two-stream CNN	Github
	Boosting semantic human matting with coarse annotations (SHMC)	CVPR	RGB	✓	human	Sequential two-stage CNN	-
	F, b, alpha matting (FBA)	arXiv	RGB-Trimap	✗	object	One-stage CNN	Github
	Attention-guided hierarchical structure aggregation for image matting (HAtt)	CVPR	RGB	✓	object	One-stage CNN	-
	High-resolution deep image matting (HDMatt)	AAAI	RGB-Trimap	✗	object	Parallel two-stream CNN	-
	Bridging composite and real: towards end-to-end deep image matting (GFM)	IJCV	RGB	✓	human, animal	Parallel two-stream CNN	Github
	Modnet: Real-time trimap-free portrait matting via objective decomposition (MODNet)	AAAI	RGB	✓	human	Parallel two-stream CNN	Github
	Learning affinity-aware upsampling for deep image matting (A2U)	CVPR	RGB-Trimap	✗	object	One-stage CNN	Github
	Mask guided matting via progressive refinement network (MGMatting)	CVPR	RGB-Coarse	✗	human	One-stage CNN	Github
	Improved image matting via real-time user clicks and uncertainty estimation (InteractiveMatting)	CVPR	RGB-Click	✗	object	Parallel two-stream CNN	-
	Smart scribbles for image matting (SmartScribbles)	TOMM	RGB-Scribble	✗	object	One-stage CNN	-
	Real-Time High-Resolution Background Matting (BMV2)	CVPR	RGB-Bg	✗	human	One-stage CNN+refine	Github
2021	Towards enhancing fine-grained details for image matting (FDMatting)	WACV	RGB-Trimap	✗	object	Two-stream CNN	-
	Semantic image matting (SIM)	CVPR	RGB-Trimap	✗	object	One-stage CNN	Github
	Privacy-preserving portrait matting (P3M-Net)	MM	RGB	✓	human	Parallel two-stream CNN	Github
	Cascade image matting with deformable graph refinement (CasDGR)	ICCV	RGB	✓	object	Parallel two-stream CNN	-
	Deep Automatic Natural Image Matting (AIM-Net)	IJCAI	RGB	✓	object	Parallel two-stream CNN	Github
	Long-range feature propagating for natural image matting (LFPNet)	MM	RGB-Trimap	✗	object	Parallel two-stream CNN	Github
	Virtual Multi-Modality Self-Supervised Foreground Matting for Human-Object Interaction (VMFM)	ICCV	RGB	✓	human-object	Sequential two-stage CNN	-
	Tripartite Information Mining and Integration for Image Matting (TIMI-Net)	ICCV	RGB-Trimap	✗	object	Parallel three-stream CNN	Github
	Deep Image Matting with Flexible Guidance Input (FGI)	BMVC	RGB-Flexible	✗	object	One-stage CNN	Github
	Highly efficient natural image matting (HEMatting)	BMVC	RGB	✓	object	Sequential two-stage CNN	-
2022	Boosting Robustness of Image Matting With Context Assembling and Strong Data Augmentation (Rmat)	CVPR	RGB-Trimap	✗	object	Parallel two-stream CNN/Transformer	-
	Deep interactive image matting with feature propagation (DIIM)	TIP	RGB-Click	✗	object	One-stage CNN	-
	User-Guided Deep Human Image Matting Using Arbitrary Trimaps (UGDMatting)	TIP	RGB-Flexible	✗	human	Parallel two-stream CNN	-
	Image matting with deep gaussian process (matting-GP)	TNNLS	RGB-Trimap	✗	object	One-stage CNN	-
	Rethinking portrait matting with privacy preserving (P3M-ViTAE)	IJCV	RGB	✓	human	Parallel two-stream CNN/Transformer	Github
	Situational Perception Guided Image Matting (SPG-IM)	MM	RGB	✓	object	Sequential two-stage CNN	-
	Human instance matting via mutual guidance and multi-instance refinement (HIM)	CVPR	RGB	✓	human	Sequential two-stage CNN	Github
	MatteFormer: Transformer-Based Image Matting via Prior-Tokens (MatteFormer)	CVPR	RGB-Trimap	✗	object	One-stage CNN/Transformer	Github
	Referring image matting (RIM)	CVPR	RGB-Language	✗	object	One-stage CNN	Github
	TransMatting: Enhancing Transparent Objects Matting with Transformers (TransMatting)	ECCV	RGB-Trimap	✗	trans.	One-stage CNN/Transformer	Github
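
In the table above, a row with an empty Year cell belongs to the most recent year listed before it. For readers who want to filter the list programmatically, the following is a minimal sketch, assuming the rows are available as tab-separated strings in the column order of the header. The helper `parse_methods` and the field names are hypothetical; the sample rows are copied from the table.

```python
from typing import Dict, List

FIELDS = ["year", "method", "pub", "input", "auto", "target", "arch", "code"]


def parse_methods(rows: List[str]) -> List[Dict[str, str]]:
    """Parse tab-separated rows, carrying the last seen year into blank year cells."""
    records = []
    current_year = ""
    for row in rows:
        cells = [cell.strip() for cell in row.split("\t")]
        if cells[0]:
            current_year = cells[0]  # a new year group starts here
        cells[0] = current_year      # blank cells inherit the group year
        records.append(dict(zip(FIELDS, cells)))
    return records


SAMPLE_ROWS = [
    "2019\tA late fusion cnn for digital matting (LF)\tCVPR\tRGB\t✓\tobject\tSequential two-stage CNN\tGithub",
    "\tLearning-based sampling for natural image matting (SampleNet)\tCVPR\tRGB-Trimap\t✗\tobject\tParallel three-stream CNN\t-",
    "2020\tNatural image matting via guided contextual attention (GCA)\tAAAI\tRGB-Trimap\t✗\tobject\tOne-stage CNN\tGithub",
]

records = parse_methods(SAMPLE_ROWS)

# Automatic methods are the ones marked with a check in the Auto. column.
automatic = [r["method"] for r in records if r["auto"] == "✓"]
# Trimap-based methods can be selected through the Input column.
trimap_based = [r["method"] for r in records if r["input"] == "RGB-Trimap"]
```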

Image Matting Datasets

We list a summary of the image matting datasets, grouped into synthetic image-based benchmarks, natural image-based benchmarks, and test sets. Within each group, the datasets are ordered by release date and are described in terms of publication venue, naturalness, matting target, resolution, number of training and test samples, and availability (with links). Note that dataset size is counted as the number of distinct foregrounds, except for TOM and RefMatte, which have pre-defined composition rules.

Name	Pub.	Natural	Target	Resolution	#Train	#Test	Availability
DIM-481	CVPR'17	✗	object	1298×1083	431	50	Link
TOM	CVPR'18	✗	transparent	-	178,000	876	Link
LF-257	CVPR'19	✗	human	553×756	228	29	Link
HATT-646	CVPR'20	✗	object	1573×1731	596	60	Link
PhotoMatte13k	CVPR'20	✗	human	-	13665	-	-
SIM	CVPR'21	✗	object	2194×1950	348	50	Link
Human-2k	ICCV'21	✗	human	2112×2075	2000	100	Link
Trans-460	ECCV'22	✗	transparent	3766×3820	410	50	Link
HIM2k	CVPR'22	✗	human	1823×1424	1500	500	Link
RefMatte	CVPR'23	✗	object	1543×1162	45000	2500	Link
AlphaMatting	CVPR'09	✓	object	3056×2340	27	8	Link
DAPM-2k	ECCV'16	✓	human	600×800	1700	300	Link
SHM-35k	MM'18	✓	human	-	52511	1400	-
SHMC-10k	CVPR'20	✓	human	-	9324	125	-
P3M-10k	MM'21	✓	human	1349×1321	9421	1000	Link
AM-2k	IJCV'22	✓	animal	1471×1195	1800	200	Link
Multi-Object-1k	MM'22	✓	human-object	-	1000	200	-
UGD-12k	TIP'22	✓	human	356×317	12066	700	Link
PhotoMatte85	CVPR'20	✗	human	2304×3456	-	85	Link
AIM-500	IJCAI'21	✓	object	1397×1260	-	500	Link
RWP-636	CVPR'21	✓	human	1038×1327	-	636	Link
PPM-100	AAAI'22	✓	human	2997×2875	-	100	Link

Performance Benchmarking

We provide a comprehensive evaluation of representative matting methods in the paper. Here, we present some subjective results of auxiliary-based matting methods on alphamatting.com and automatic matting methods on P3M-500-NP.

Statement

If you are interested in our work, please consider citing the following:

@article{li2023deep,
  title={Deep Image Matting: A Comprehensive Survey},
  author={Jizhizi Li and Jing Zhang and Dacheng Tao},
  journal={ArXiv},
  year={2023},
  volume={abs/2304.04672}
}

This project is under the MIT license. For further questions, please contact Jizhizi Li at [email protected].

Relevant Projects

[1] Deep Automatic Natural Image Matting, IJCAI, 2021 | Paper | Github
Jizhizi Li, Jing Zhang, and Dacheng Tao

[2] Privacy-preserving Portrait Matting, ACM MM, 2021 | Paper | Github
Jizhizi Li^∗, Sihan Ma^∗, Jing Zhang, Dacheng Tao

[3] Bridging Composite and Real: Towards End-to-end Deep Image Matting, IJCV, 2022 | Paper | Github
Jizhizi Li^∗, Jing Zhang^∗, Stephen J. Maybank, Dacheng Tao

[4] Referring Image Matting, CVPR, 2023 | Paper | Github
Jizhizi Li, Jing Zhang, and Dacheng Tao

[5] Rethinking Portrait Matting with Privacy Preserving, IJCV, 2023 | Paper | Github
Sihan Ma^∗, Jizhizi Li^∗, Jing Zhang, He Zhang, Dacheng Tao
