• Stars: 229
• Rank 173,728 (Top 4 %)
• Language: TeX
• Created over 2 years ago
• Updated over 1 year ago

Repository Details

A comprehensive list [AIM@IJCAI'21, P3M@MM'21, GFM@IJCV'22, RIM@CVPR'23, P3MNet@IJCV'23] of our research works related to image matting, including papers, codes, datasets, demos, and citations. Note: The repo for [IJCV'23] "Rethinking Portrait Matting with Privacy Preserving" has been moved to: https://github.com/ViTAE-Transformer/P3M-Net

Image Matting

This repo contains a comprehensive list of our research works related to image matting, including papers, codes, datasets, demos, and citations. For any related questions, please contact Jizhizi Li at [email protected] and Sihan Ma at [email protected].


🚀 News

[2023-04-10]: Published the paper Deep Image Matting: A Comprehensive Survey on arXiv.

[2023-03-28]: The paper Rethinking Portrait Matting with Privacy Preserving has been accepted by the International Journal of Computer Vision (IJCV) 🎉

[2023-02-28]: The paper Referring Image Matting has been accepted by the Computer Vision and Pattern Recognition Conference (CVPR) 🎉


Overview

1. Deep Image Matting: A Comprehensive Survey, arXiv, 2023  

2. Rethinking Portrait Matting with Privacy Preserving, IJCV, 2023  

3. Referring Image Matting, CVPR, 2023  

4. Bridging Composite and Real: Towards End-to-end Deep Image Matting, IJCV, 2022  

5. Privacy-preserving Portrait Matting, ACM MM, 2021  

6. Deep Automatic Natural Image Matting, IJCAI, 2021  

Projects

📘 Deep Image Matting: A Comprehensive Survey [arXiv-2023]

Jizhizi Li, Jing Zhang, and Dacheng Tao

Paper | Github Code | BibTex

Image matting refers to extracting a precise alpha matte from natural images, and it plays a critical role in various downstream applications, such as image editing. The emergence of deep learning has revolutionized the field of image matting and given birth to multiple new techniques, including automatic, interactive, and referring image matting. Here we present a comprehensive review of recent advancements in image matting in the era of deep learning.
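To make the role of the alpha matte concrete, here is a minimal sketch of the standard compositing relation that matting inverts, I = αF + (1 − α)B, where F is the foreground, B the background, and α the per-pixel opacity in [0, 1]. The function name `composite` and the tiny grayscale grids are hypothetical and chosen only for illustration; they are not taken from the survey or its code.

```python
def composite(foreground, background, alpha):
    """Blend two grayscale images per pixel: I = alpha * F + (1 - alpha) * B.

    All three inputs are 2-D lists of floats with the same shape;
    alpha values are expected to lie in [0, 1].
    """
    return [
        [a * f + (1.0 - a) * b for f, b, a in zip(f_row, b_row, a_row)]
        for f_row, b_row, a_row in zip(foreground, background, alpha)
    ]


# Hypothetical 2x2 example: white foreground over black background.
foreground = [[1.0, 1.0], [1.0, 1.0]]
background = [[0.0, 0.0], [0.0, 0.0]]
alpha = [[1.0, 0.5], [0.25, 0.0]]

image = composite(foreground, background, alpha)
print(image)
```

Matting is the inverse problem: given only the composited image, estimate α (and often F), which is why fractional α values in hair, fur, or transparent regions are the hard part.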


📘 Rethinking Portrait Matting with Privacy Preserving [IJCV-2023]

Sihan Ma∗, Jizhizi Li∗, Jing Zhang, He Zhang, and Dacheng Tao. (*equal contribution)

Paper | Github Code | Dataset | Demo | BibTex

This paper introduces three variants of P3M-Net, based on both transformer and CNN backbones, to solve the privacy-preserving portrait matting problem. A simple yet effective Copy and Paste strategy (P3M-CP) is also devised to enable the matting model to process both face-blurred and normal images without extra effort during inference.


📘 Referring Image Matting [CVPR-2023]

Jizhizi Li, Jing Zhang, and Dacheng Tao

Paper | Github Code | Dataset | BibTex

Image matting refers to extracting the accurate foregrounds in the image. Current automatic methods tend to extract all the salient objects in the image indiscriminately. In this paper, we propose a new task named Referring Image Matting (RIM), referring to extracting the meticulous alpha matte of the specific object that can best match the given natural language description. We then propose a large-scale dataset RefMatte and a carefully designed method CLIPMat to serve as a baseline suite for RIM. We believe the new task RIM along with the RefMatte dataset and the method CLIPMat will open new research directions in this area and facilitate future studies.


📘 Bridging Composite and Real: Towards End-to-end Deep Image Matting [IJCV-2022]

Jizhizi Li∗, Jing Zhang∗, Stephen J. Maybank, and Dacheng Tao. (*equal contribution)

Paper | Github Code | Dataset | Demo | BibTex

We propose a novel Glance and Focus Matting network (GFM), which employs a shared encoder and two separate decoders to learn two tasks, high-level semantic segmentation and low-level detail matting, in a collaborative manner for end-to-end image matting. We also establish a novel Animal Matting dataset (AM-2k) for the end-to-end matting task. Furthermore, we systematically investigate the domain gap between composite images and natural images, and propose a carefully designed composition route, RSSN, and a large-scale high-resolution background dataset (BG-20k) to serve as better candidates for composition.


📘 Privacy-Preserving Portrait Matting [ACM MM-21]

Jizhizi Li∗, Sihan Ma∗, Jing Zhang, and Dacheng Tao. (*equal contribution)

Paper | Github Code | Dataset | BibTex

This work presents P3M-10k, the first large-scale anonymized benchmark for Privacy-Preserving Portrait Matting, to address the growing concerns about privacy in image matting. It also proposes P3M-Net, which leverages the power of a unified framework for both semantic perception and detail matting, and specifically emphasizes the interaction between them and the encoder to facilitate the matting process.


📘 Deep Automatic Natural Image Matting [IJCAI-21]

Jizhizi Li, Jing Zhang, and Dacheng Tao

Paper | Github Code | Dataset | BibTex

We investigate the difficulties of extending automatic matting methods to natural images with salient transparent/meticulous foregrounds or non-salient foregrounds. To address them, we propose a novel end-to-end matting network that predicts a generalized trimap for any image of the above types as a unified semantic representation and simultaneously guides the matting network to focus on the transition areas via an attention mechanism. We also construct a test set, AIM-500, that contains 500 diverse natural images covering all types along with manually labeled alpha mattes, making it feasible to benchmark the generalization ability of AIM models.
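For readers new to the term, the sketch below derives a conventional three-level trimap (background, transition, foreground) from a ground-truth alpha matte. It illustrates only the classic notion of a trimap and its transition area; it is not the generalized trimap that the network above predicts, whose definition depends on the image type. The function name `alpha_to_trimap`, the tolerance `eps`, and the 0/128/255 label values are assumptions chosen for illustration.

```python
def alpha_to_trimap(alpha, eps=1e-6):
    """Map each alpha value to a trimap label.

    0   -> definite background (alpha is ~0)
    255 -> definite foreground (alpha is ~1)
    128 -> transition area (fractional alpha)

    `alpha` is a 2-D list of floats in [0, 1]; `eps` is a hypothetical
    tolerance for treating values as exactly 0 or 1.
    """
    trimap = []
    for row in alpha:
        labels = []
        for a in row:
            if a <= eps:
                labels.append(0)
            elif a >= 1.0 - eps:
                labels.append(255)
            else:
                labels.append(128)
        trimap.append(labels)
    return trimap


# Hypothetical one-row matte: background, soft edge, foreground.
trimap = alpha_to_trimap([[0.0, 0.3, 1.0]])
print(trimap)
```

The transition label marks exactly the pixels where alpha is fractional, which is the region a matting network needs to resolve in detail.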

More Repositories

1. ViTPose (Python, 1,313 stars): The official repo for [NeurIPS'22] "ViTPose: Simple Vision Transformer Baselines for Human Pose Estimation" and [TPAMI'23] "ViTPose++: Vision Transformer for Generic Body Pose Estimation"

2. ViTDet (Python, 524 stars): Unofficial implementation for [ECCV'22] "Exploring Plain Vision Transformer Backbones for Object Detection"

3. ViTAE-Transformer-Remote-Sensing (TeX, 446 stars): A comprehensive list [SAMRS@NeurIPS'23, RVSA@TGRS'22, RSP@TGRS'22] of our research works related to remote sensing, including papers, codes, and citations. Note: The repo for [TGRS'22] "An Empirical Study of Remote Sensing Pretraining" has been moved to: https://github.com/ViTAE-Transformer/RSP

4. Remote-Sensing-RVSA (Python, 403 stars): The official repo for [TGRS'22] "Advancing Plain Vision Transformer Towards Remote Sensing Foundation Model"

5. SAMRS (Python, 263 stars): The official repo for [NeurIPS'23] "SAMRS: Scaling-up Remote Sensing Segmentation Dataset with Segment Anything Model"

6. ViTAE-Transformer (Python, 249 stars): The official repo for [NeurIPS'21] "ViTAE: Vision Transformer Advanced by Exploring Intrinsic Inductive Bias" and [IJCV'22] "ViTAEv2: Vision Transformer Advanced by Exploring Inductive Bias for Image Recognition and Beyond"

7. QFormer (Python, 158 stars): The official repo for [TPAMI'23] "Vision Transformer with Quadrangle Attention"

8. ViTAE-VSA (Python, 152 stars): The official repo for [ECCV'22] "VSA: Learning Varied-Size Window Attention in Vision Transformers"

9. MTP (Python, 140 stars): The official repo for [JSTARS'24] "MTP: Advancing Remote Sensing Foundation Model via Multi-Task Pretraining"

10. RSP (Python, 130 stars): The official repo for [TGRS'22] "An Empirical Study of Remote Sensing Pretraining"

11. P3M-Net (Python, 90 stars): The official repo for [IJCV'23] "Rethinking Portrait Matting with Privacy Preserving"

12. DeepSolo (Python, 68 stars): [CVPR 2023] DeepSolo: Let Transformer Decoder with Explicit Points Solo for Text Spotting

13. ViTAE-Transformer-Scene-Text-Detection (Python, 37 stars): The official repo for [IJCV'22] "I3CL: Intra- and Inter-Instance Collaborative Learning for Arbitrary-shaped Scene Text Detection"

14. LeMeViT (Python, 37 stars): The official repo for [IJCAI'24] "LeMeViT: Efficient Vision Transformer with Learnable Meta Tokens for Remote Sensing Image Interpretation"

15. SimDistill (Python, 22 stars): The official repo for [AAAI 2024] "SimDistill: Simulated Multi-modal Distillation for BEV 3D Object Detection"

16. VOS-LLB (Python, 10 stars): The official repo for [AAAI'23] "Learning to Learn Better for Video Object Segmentation"

17. APTv2 (Python, 9 stars): The official repo for the extension of [NeurIPS'22] "APT-36K: A Large-scale Benchmark for Animal Pose Estimation and Tracking": https://github.com/pandorgan/APT-36K

18. I3CL (Python, 2 stars): The official repo for [IJCV'22] "I3CL: Intra- and Inter-Instance Collaborative Learning for Arbitrary-shaped Scene Text Detection"