[CVPR 2023] Referring Image Matting

This is the official repository of the paper Referring Image Matting.

Jizhizi Li, Jing Zhang, and Dacheng Tao

Introduction | RefMatte | CLIPMat | Results | Statement


🚀 News

[2023-04-17]: The datasets RefMatte and RefMatte-RW100 can now be openly accessed from the links below! Please follow the dataset release agreements to access.

Dataset | Dataset Link (OneDrive) | Size | Dataset Release Agreement
RefMatte | Link (pw: 3ft9cb) | 43.7 GB | Agreement (CC BY-NC License)
RefMatte-RW100 | Link (pw: 3ft9cb) | 66.6 MB | Agreement (CC BY-NC License)

[2023-02-28]: The paper has been accepted by the Computer Vision and Pattern Recognition Conference (CVPR)! 🎉

Introduction

Image matting refers to extracting accurate foregrounds from an image. Current automatic methods tend to extract all the salient objects in the image indiscriminately. In this paper, we propose a new task named Referring Image Matting (RIM), which aims to extract the meticulous alpha matte of the specific object that best matches a given natural language description. We then propose a large-scale dataset, RefMatte, and a carefully designed method, CLIPMat, to serve as a baseline suite for RIM. We believe the new task, together with RefMatte and CLIPMat, will open new research directions in this area and facilitate future studies. The datasets have been released (see News); the code and the method will be published soon.

RefMatte and RefMatte-RW100

Prevalent visual grounding methods are all limited to the segmentation level, probably due to the lack of high-quality datasets. To fill the gap, we establish the first large-scale challenging dataset, RefMatte, by designing a comprehensive image composition and expression generation engine that produces synthetic images on top of current public high-quality matting foregrounds, with flexible logic and re-labelled diverse attributes. RefMatte consists of 230 object categories, 47,500 images, 118,749 expression-region entities, and 474,996 expressions, and can easily be extended in the future. In addition, we construct a real-world test set, RefMatte-RW100, consisting of 100 natural images with manually generated phrase annotations, to further evaluate the generalization of RIM models. We show some examples of RefMatte as follows, including the images, the alpha mattes and the input texts. More can be seen from this page. We have released RefMatte and RefMatte-RW100; please follow the dataset release agreements to access them.

Dataset | Dataset Link (OneDrive) | Size | Dataset Release Agreement
RefMatte | Link (pw: 3ft9cb) | 43.7 GB | Agreement (CC BY-NC License)
RefMatte-RW100 | Link (pw: 3ft9cb) | 66.6 MB | Agreement (CC BY-NC License)
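
The composition engine itself is not reproduced here. As a rough illustration of how a synthetic matting image relates to its alpha matte, the sketch below applies the standard alpha compositing equation I = alpha * F + (1 - alpha) * B with NumPy. The function name `composite` and the toy arrays are assumptions made for this example and are not part of the RefMatte code.

```python
import numpy as np


def composite(foreground, background, alpha):
    """Blend a foreground over a background with a per-pixel alpha matte.

    foreground, background: float arrays of shape (H, W, 3) in [0, 1].
    alpha: float array of shape (H, W) in [0, 1].
    """
    alpha = alpha[..., None]  # (H, W) -> (H, W, 1) so it broadcasts over channels
    return alpha * foreground + (1.0 - alpha) * background


# Toy 2x2 example (hypothetical data): white foreground over black background.
fg = np.ones((2, 2, 3))
bg = np.zeros((2, 2, 3))
alpha = np.array([[1.0, 0.5],
                  [0.25, 0.0]])
image = composite(fg, bg, alpha)
```

With a white foreground and a black background, each channel of `image` equals the alpha matte, which makes the role of the matte easy to inspect.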

We also generate the word cloud of the keywords, attributes, and relationships in RefMatte, as shown below. As can be seen, humans and animals make up a large portion of the dataset, since they are very common in the image matting task. The most frequent attributes in RefMatte are male, gray, transparent, and salient, while the relationship words are more balanced.

CLIPMat

Furthermore, we present a novel baseline method for RIM, CLIPMat, which includes a context-embedded prompt, a text-driven semantic pop-up, and a multi-level details extractor. Extensive experiments on RefMatte in both keyword and expression settings validate the superiority of CLIPMat over representative methods. We show the diagram as follows; more information can be found in the paper.

Results

We show some examples of CLIPMat's test results on the RefMatte test set and RefMatte-RW100, given the images and text inputs under both the keyword-based and expression-based settings. More can be seen from this page.

Statement

If you are interested in our work, please consider citing the following:

@inproceedings{rim,
  title={Referring Image Matting},
  author={Li, Jizhizi and Zhang, Jing and Tao, Dacheng},
  booktitle={Proceedings of the IEEE Computer Vision and Pattern Recognition},
  year={2023}
}

This project is under the CC BY-NC license. For further questions, please contact Jizhizi Li at [email protected].

Relevant Projects

[1] Deep Automatic Natural Image Matting, IJCAI, 2021 | Paper | Github
     Jizhizi Li, Jing Zhang, and Dacheng Tao

[2] Privacy-Preserving Portrait Matting, ACM MM, 2021 | Paper | Github
     Jizhizi Li*, Sihan Ma*, Jing Zhang, and Dacheng Tao

[3] Bridging Composite and Real: Towards End-to-end Deep Image Matting, IJCV, 2022 | Paper | Github
     Jizhizi Li*, Jing Zhang*, Stephen J. Maybank, and Dacheng Tao

[4] Rethinking Portrait Matting with Privacy Preserving, IJCV, 2023 | Paper | Github
     Sihan Ma*, Jizhizi Li*, Jing Zhang, He Zhang, and Dacheng Tao

[5] Deep Image Matting: A Comprehensive Survey, ArXiv, 2023 | Paper | Github
     Jizhizi Li, Jing Zhang, and Dacheng Tao
