Deep Image Matting: A Comprehensive Survey

This is the official repository of the paper Deep Image Matting: A Comprehensive Survey.

Jizhizi Li, Jing Zhang, and Dacheng Tao
The University of Sydney, Sydney, Australia

Introduction | Preliminary | Methods | Datasets | Benchmark | Statement

Introduction

Image matting refers to extracting a precise alpha matte from natural images, and it plays a critical role in various downstream applications, such as image editing. The emergence of deep learning has revolutionized the field of image matting and given birth to multiple new techniques, including automatic, interactive, and referring image matting. Here we present a comprehensive review of recent advancements in image matting in the era of deep learning by focusing on two fundamental sub-tasks: auxiliary input-based image matting and automatic image matting.

Preliminary

Image matting, which refers to the precise extraction of the soft matte from foreground objects in arbitrary images, has been extensively studied for several decades. The process can be described mathematically as I_i = α_i F_i + (1 - α_i) B_i, where I represents the input image, F represents the foreground image, B represents the background image, and i indexes the pixel. The opacity of the pixel in the foreground is denoted by α_i, which ranges from 0 to 1. We also show the typical input image, ground truth alpha matte and various auxiliary inputs such as trimap, background, coarse map, user clicks, scribbles, and a text description in the following figure. The text description for this image can be "the cute smiling brown dog in the middle of the image".
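
As a rough illustration of the compositing model in this section, where each observed pixel is the foreground color weighted by its opacity plus the background color weighted by the remainder (I = αF + (1 - α)B per pixel), the following minimal sketch blends a few toy pixels in plain Python. The names `composite_pixel` and `composite` are hypothetical and not part of this repository; real pipelines would typically operate on image arrays rather than lists.

```python
from typing import List, Sequence, Tuple

Pixel = Tuple[float, ...]  # color channels (e.g. RGB), each in [0, 1]


def composite_pixel(alpha: float, fg: Pixel, bg: Pixel) -> Pixel:
    """Blend one pixel: alpha * F + (1 - alpha) * B, channel by channel."""
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must lie in [0, 1]")
    return tuple(alpha * f + (1.0 - alpha) * b for f, b in zip(fg, bg))


def composite(
    alphas: Sequence[float], fgs: Sequence[Pixel], bgs: Sequence[Pixel]
) -> List[Pixel]:
    """Apply the compositing equation to flat, equally long pixel lists."""
    if not (len(alphas) == len(fgs) == len(bgs)):
        raise ValueError("alphas, fgs and bgs must have the same length")
    return [composite_pixel(a, f, b) for a, f, b in zip(alphas, fgs, bgs)]


# Toy data (an assumption for illustration): a red foreground over a blue
# background, with an opaque, a half-transparent and a transparent pixel.
alphas = [1.0, 0.5, 0.0]
foreground = [(1.0, 0.0, 0.0)] * 3
background = [(0.0, 0.0, 1.0)] * 3
image = composite(alphas, foreground, background)
```

Matting is the inverse of this step: given only the composited image (and optionally an auxiliary input such as a trimap), a method estimates the alpha value of every pixel.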

Image Matting Methods

We compile a timeline of the developments in deep learning-based image matting methods as follows.

We also list a summary of image matting methods organized according to the year of publication, the publication venue, input modality, automaticity, matting target, architecture, and availability of the code (with the link). The list of papers is chronologically ordered. Please note that [U] stands for the unofficial implementation of the code.

Year | Method | Pub. | Input | Auto. | Target | Arch. | Code
2016 | Deep automatic portrait matting (DAPM) | ECCV | RGB | ✓ | human | Sequential two-step CNN | -
2016 | Natural image matting using deep convolutional neural networks (DCNN) | ECCV | RGB-Coarse | ✗ | object | One-stage CNN | -
2017 | Deep image matting (DIM) | CVPR | RGB-Trimap | ✗ | object | One-stage CNN+Refine | Github[U]
2017 | Fast deep matting for portrait animation on mobile phone (FDM) | MM | RGB | ✓ | human | Sequential two-step CNN | -
2018 | Tom-Net: Learning transparent object matting from a single image (TOM-Net) | CVPR | RGB | ✓ | trans. | Sequential two-step CNN+Refine | Github
2018 | Deep propagation based image matting (DMPN) | IJCAI | RGB-Trimap | ✗ | object | One-stage CNN | -
2018 | Alphagan: Generative adversarial networks for natural image matting (AlphaGAN) | BMVC | RGB-Trimap | ✗ | object | One-stage GAN | Github[U]
2018 | Semantic soft segmentation (SSS) | TOG | RGB | ✓ | object | Sequential two-stage | Github
2018 | Semantic human matting (SHM) | MM | RGB | ✓ | human | Sequential two-step CNN | Github[U]
2018 | Active matting (ActiveMatting) | NeurIPS | RGB-Click | ✗ | object | One-stage RNN | -
2019 | A late fusion cnn for digital matting (LF) | CVPR | RGB | ✓ | object | Sequential two-stage CNN | Github
2019 | Learning-based sampling for natural image matting (SampleNet) | CVPR | RGB-Trimap | ✗ | object | Parallel three-stream CNN | -
2019 | Indices matter: Learning to index for deep image matting (IndexNet) | ICCV | RGB-Trimap | ✗ | object | One-stage CNN | Github
2019 | Disentangled image matting (AdaMatting) | ICCV | RGB-Trimap | ✗ | object | Parallel two-stream CNN+Refine | -
2019 | Context-aware image matting for simultaneous foreground and alpha estimation (Context-Aware) | ICCV | RGB-Trimap | ✗ | object | Two-stream CNN | Github
2020 | Natural image matting via guided contextual attention (GCA) | AAAI | RGB-Trimap | ✗ | object | One-stage CNN | Github
2020 | Background matting: The world is your green screen (BM) | CVPR | RGB-Bg | ✗ | human | Parallel four-stream CNN | Github
2020 | Hierarchical opacity propagation for image matting (HOP) | arXiv | RGB-Trimap | ✗ | object | Parallel two-stream CNN | Github
2020 | Boosting semantic human matting with coarse annotations (SHMC) | CVPR | RGB | ✓ | human | Sequential two-stage CNN | -
2020 | F, b, alpha matting (FBA) | arXiv | RGB-Trimap | ✗ | object | One-stage CNN | Github
2020 | Attention-guided hierarchical structure aggregation for image matting (HAtt) | CVPR | RGB | ✓ | object | One-stage CNN | -
2020 | High-resolution deep image matting (HDMatt) | AAAI | RGB-Trimap | ✗ | object | Parallel two-stream CNN | -
2020 | Bridging composite and real: towards end-to-end deep image matting (GFM) | IJCV | RGB | ✓ | human, animal | Parallel two-stream CNN | Github
2020 | Modnet: Real-time trimap-free portrait matting via objective decomposition (MODNet) | AAAI | RGB | ✓ | human | Parallel two-stream CNN | Github
2020 | Learning affinity-aware upsampling for deep image matting (A2U) | CVPR | RGB-Trimap | ✗ | object | One-stage CNN | Github
2020 | Mask guided matting via progressive refinement network (MGMatting) | CVPR | RGB-Coarse | ✗ | human | One-stage CNN | Github
2020 | Improved image matting via real-time user clicks and uncertainty estimation (InteractiveMatting) | CVPR | RGB-Click | ✗ | object | Parallel two-stream CNN | -
2020 | Smart scribbles for image matting (SmartScribbles) | TOMM | RGB-Scribble | ✗ | object | One-stage CNN | -
2020 | Real-Time High-Resolution Background Matting (BMV2) | CVPR | RGB-Bg | ✗ | human | One-stage CNN+Refine | Github
2021 | Towards enhancing fine-grained details for image matting (FDMatting) | WACV | RGB-Trimap | ✗ | object | Two-stream CNN | -
2021 | Semantic image matting (SIM) | CVPR | RGB-Trimap | ✗ | object | One-stage CNN | Github
2021 | Privacy-preserving portrait matting (P3M-Net) | MM | RGB | ✓ | human | Parallel two-stream CNN | Github
2021 | Cascade image matting with deformable graph refinement (CasDGR) | ICCV | RGB | ✓ | object | Parallel two-stream CNN | -
2021 | Deep Automatic Natural Image Matting (AIM-Net) | IJCAI | RGB | ✓ | object | Parallel two-stream CNN | Github
2021 | Long-range feature propagating for natural image matting (LFPNet) | MM | RGB-Trimap | ✗ | object | Parallel two-stream CNN | Github
2021 | Virtual Multi-Modality Self-Supervised Foreground Matting for Human-Object Interaction (VMFM) | ICCV | RGB | ✓ | human-object | Sequential two-stage CNN | -
2021 | Tripartite Information Mining and Integration for Image Matting (TIMI-Net) | ICCV | RGB-Trimap | ✗ | object | Parallel three-stream CNN | Github
2021 | Deep Image Matting with Flexible Guidance Input (FGI) | BMVC | RGB-Flexible | ✗ | object | One-stage CNN | Github
2021 | Highly efficient natural image matting (HEMatting) | BMVC | RGB | ✓ | object | Sequential two-stage CNN | -
2022 | Boosting Robustness of Image Matting With Context Assembling and Strong Data Augmentation (Rmat) | CVPR | RGB-Trimap | ✗ | object | Parallel two-stream CNN/Transformer | -
2022 | Deep interactive image matting with feature propagation (DIIM) | TIP | RGB-Click | ✗ | object | One-stage CNN | -
2022 | User-Guided Deep Human Image Matting Using Arbitrary Trimaps (UGDMatting) | TIP | RGB-Flexible | ✗ | human | Parallel two-stream CNN | -
2022 | Image matting with deep gaussian process (matting-GP) | TNNLS | RGB-Trimap | ✗ | object | One-stage CNN | -
2022 | Rethinking portrait matting with privacy preserving (P3M-ViTAE) | IJCV | RGB | ✓ | human | Parallel two-stream CNN/Transformer | Github
2022 | Situational Perception Guided Image Matting (SPG-IM) | MM | RGB | ✓ | object | Sequential two-stage CNN | -
2022 | Human instance matting via mutual guidance and multi-instance refinement (HIM) | CVPR | RGB | ✓ | human | Sequential two-stage CNN | Github
2022 | MatteFormer: Transformer-Based Image Matting via Prior-Tokens (MatteFormer) | CVPR | RGB-Trimap | ✗ | object | One-stage CNN/Transformer | Github
2022 | Referring image matting (RIM) | CVPR | RGB-Language | ✗ | object | One-stage CNN | Github
2022 | TransMatting: Enhancing Transparent Objects Matting with Transformers (TransMatting) | ECCV | RGB-Trimap | ✗ | trans. | One-stage CNN/Transformer | Github

Image Matting Datasets

We list a summary of the image matting datasets, categorized as synthetic image-based benchmarks, natural image-based benchmarks, and test sets. The datasets are ordered by release date within each category and are described in terms of publication venue, naturalness, matting target, resolution, number of training and test samples, and availability (along with their links). Note that the size of each dataset is counted as the number of distinct foregrounds, except for TOM and RefMatte, which have pre-defined composite rules.

Name | Pub. | Natural | Target | Resolution | #Train | #Test | Availability
DIM-481 | CVPR'17 | ✗ | object | 1298×1083 | 431 | 50 | Link
TOM | CVPR'18 | ✗ | transparent | - | 178,000 | 876 | Link
LF-257 | CVPR'19 | ✗ | human | 553×756 | 228 | 29 | Link
HATT-646 | CVPR'20 | ✗ | object | 1573×1731 | 596 | 60 | Link
PhotoMatte13k | CVPR'20 | ✗ | human | - | 13665 | - | -
SIM | CVPR'21 | ✗ | object | 2194×1950 | 348 | 50 | Link
Human-2k | ICCV'21 | ✗ | human | 2112×2075 | 2000 | 100 | Link
Trans-460 | ECCV'22 | ✗ | transparent | 3766×3820 | 410 | 50 | Link
HIM2k | CVPR'22 | ✗ | human | 1823×1424 | 1500 | 500 | Link
RefMatte | CVPR'23 | ✗ | object | 1543×1162 | 45000 | 2500 | Link
AlphaMatting | CVPR'09 | ✓ | object | 3056×2340 | 27 | 8 | Link
DAPM-2k | ECCV'16 | ✓ | human | 600×800 | 1700 | 300 | Link
SHM-35k | MM'18 | ✓ | human | - | 52511 | 1400 | -
SHMC-10k | CVPR'20 | ✓ | human | - | 9324 | 125 | -
P3M-10k | MM'21 | ✓ | human | 1349×1321 | 9421 | 1000 | Link
AM-2k | IJCV'22 | ✓ | animal | 1471×1195 | 1800 | 200 | Link
Multi-Object-1k | MM'22 | ✓ | human-object | - | 1000 | 200 | -
UGD-12k | TIP'22 | ✓ | human | 356×317 | 12066 | 700 | Link
PhotoMatte85 | CVPR'20 | ✗ | human | 2304×3456 | - | 85 | Link
AIM-500 | IJCAI'21 | ✓ | object | 1397×1260 | - | 500 | Link
RWP-636 | CVPR'21 | ✓ | human | 1038×1327 | - | 636 | Link
PPM-100 | AAAI'22 | ✓ | human | 2997×2875 | - | 100 | Link

Performance Benchmarking

We provide a comprehensive evaluation of representative matting methods in the paper. Here, we present some subjective results of auxiliary-based matting methods on alphamatting.com and automatic matting methods on P3M-500-NP.
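
This README does not list the metrics used in the evaluation. As a hedged illustration only, the sketch below computes two error measures that are commonly reported for alpha mattes, the sum of absolute differences (SAD) and the mean squared error (MSE), between a predicted and a ground truth matte. The function names `sad` and `mse` and the toy values are assumptions, not code or results from the paper.

```python
from typing import Sequence


def sad(pred: Sequence[float], gt: Sequence[float]) -> float:
    """Sum of absolute differences between two flattened alpha mattes."""
    if len(pred) != len(gt):
        raise ValueError("mattes must have the same number of pixels")
    return sum(abs(p - g) for p, g in zip(pred, gt))


def mse(pred: Sequence[float], gt: Sequence[float]) -> float:
    """Mean squared error between two flattened alpha mattes."""
    if len(pred) != len(gt):
        raise ValueError("mattes must have the same number of pixels")
    if not pred:
        raise ValueError("mattes must not be empty")
    return sum((p - g) ** 2 for p, g in zip(pred, gt)) / len(pred)


# Toy 4-pixel mattes (made-up values for illustration, alpha in [0, 1]).
predicted = [0.0, 0.5, 1.0, 1.0]
ground_truth = [0.0, 1.0, 1.0, 0.5]
sad_value = sad(predicted, ground_truth)
mse_value = mse(predicted, ground_truth)
```

Published numbers often differ in scaling and in whether the error is restricted to the unknown region of a trimap; this sketch does neither, so its values are not directly comparable with reported benchmarks.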

Statement

If you are interested in our work, please consider citing the following:

@article{li2023deep,
  title={Deep Image Matting: A Comprehensive Survey},
  author={Jizhizi Li and Jing Zhang and Dacheng Tao},
  journal={ArXiv},
  year={2023},
  volume={abs/2304.04672}
}

This project is under the MIT license. For further questions, please contact Jizhizi Li at [email protected].

Relevant Projects

[1] Deep Automatic Natural Image Matting, IJCAI, 2021 | Paper | Github
    Jizhizi Li, Jing Zhang, and Dacheng Tao

[2] Privacy-preserving Portrait Matting, ACM MM, 2021 | Paper | Github
    Jizhizi Li*, Sihan Ma*, Jing Zhang, Dacheng Tao

[3] Bridging Composite and Real: Towards End-to-end Deep Image Matting, IJCV, 2022 | Paper | Github
    Jizhizi Li*, Jing Zhang*, Stephen J. Maybank, Dacheng Tao

[4] Referring Image Matting, CVPR, 2023 | Paper | Github
    Jizhizi Li, Jing Zhang, and Dacheng Tao

[5] Rethinking Portrait Matting with Privacy Preserving, IJCV, 2023 | Paper | Github
    Sihan Ma*, Jizhizi Li*, Jing Zhang, He Zhang, Dacheng Tao
