Code for Towards Real-Time Automatic Portrait Matting on Mobile Devices

Towards Real-Time Automatic Portrait Matting on Mobile Devices

We tackle the problem of automatic portrait matting on mobile devices. The proposed model aims to attain real-time inference on mobile devices with minimal degradation of model performance. Our model, MMNet, is based on multi-branch dilated convolution with linear bottleneck blocks; it outperforms the state-of-the-art model and is orders of magnitude faster. The model can be accelerated four times to attain 30 FPS on a Xiaomi Mi 5 device with a moderate increase in the gradient error. Under the same conditions, our model has an order of magnitude fewer parameters and is faster than Mobile DeepLabv3 while maintaining comparable performance.

Figure: The trade-off between gradient error and latency on a mobile device. Latency is measured on a Qualcomm Snapdragon 820 MSM8996 CPU. The size of each circle is proportional to the logarithm of the number of parameters used by the model. The different circles for Mobile DeepLabv3 are obtained by varying the output stride and width multiplier, and each circle is labeled with its width multiplier. Results using 128 x 128 inputs are marked with *; all other results use 256 x 256 inputs. Note that MMNet outperforms all other models, forming a Pareto front. The number of parameters for LDN+FB is not reported in their paper.
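The notion of a Pareto front used in the caption can be made concrete with a small sketch. It assumes each model is summarized by a (latency, gradient error) pair where lower is better for both. The function name `pareto_front` and the three example entries are hypothetical placeholders, not measurements from the paper.

```python
def pareto_front(points):
    """Return the names of the points that no other point dominates.

    `points` maps a name to a (latency_ms, gradient_error) pair; lower is
    better on both axes. A point is dominated when another point is at
    least as good on both axes and strictly better on at least one.
    """
    front = []
    for name, (lat, err) in points.items():
        dominated = any(
            (o_lat <= lat and o_err <= err) and (o_lat < lat or o_err < err)
            for other, (o_lat, o_err) in points.items()
            if other != name
        )
        if not dominated:
            front.append(name)
    return front


# Placeholder values for illustration only (NOT results from the paper).
example = {
    "model_a": (30.0, 3.0),
    "model_b": (60.0, 2.5),
    "model_c": (70.0, 3.5),
}
print(pareto_front(example))  # ['model_a', 'model_b']
```

Here `model_c` is dropped because `model_a` is both faster and more accurate, while `model_a` and `model_b` each win on one axis and therefore both stay on the front.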

Requirements

  • Python 3.6+
  • TensorFlow 1.6

Installation

git clone --recursive https://github.com/hyperconnect/MMNet.git
pip3 install -r requirements/py36-gpu.txt

Dataset

The dataset for training and evaluation must follow the directory structure depicted below. To use split names other than train and test, pass the --dataset_split_name argument to train.py or evaluate.py.

dataset_directory
  |___ train
  |   |__ mask
  |   |__ image
  |
  |___ test
      |__ mask
      |__ image
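Before launching a training run, it can help to confirm that a dataset directory matches the layout above. The following is a minimal sketch, not part of this repository: the helper name `check_dataset_layout` is hypothetical, and it only checks that the mask and image sub-directories exist for each split.

```python
import tempfile
from pathlib import Path


def check_dataset_layout(root, splits=("train", "test")):
    """Return the expected sub-directories that are missing under `root`.

    Hypothetical helper: for every split it expects `<split>/mask` and
    `<split>/image`, mirroring the tree shown above.
    """
    root = Path(root)
    missing = []
    for split in splits:
        for kind in ("mask", "image"):
            sub = root / split / kind
            if not sub.is_dir():
                missing.append(sub.relative_to(root).as_posix())
    return missing


if __name__ == "__main__":
    # Build a throwaway skeleton and confirm that nothing is reported missing.
    with tempfile.TemporaryDirectory() as tmp:
        for split in ("train", "test"):
            for kind in ("mask", "image"):
                (Path(tmp) / split / kind).mkdir(parents=True)
        print(check_dataset_layout(tmp))  # []
```

If custom split names are passed via --dataset_split_name, the same names would go into the `splits` argument of this sketch.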

Training

The scripts directory contains example scripts for training and evaluating MMNet and Mobile DeepLabv3. Training scripts accept two arguments: the dataset path and the train directory. The dataset path must point to a directory with the structure described in the previous section.

MMNet

Training of MMNet with depth multiplier 1.0 and input image size 256.

./scripts/train_mmnet_dm1.0_256.sh /path/to/dataset /path/to/training/directory

Mobile DeepLabv3

Training of Mobile DeepLabv3 with output stride 16, depth multiplier 0.5 and input image size 256.

./scripts/train_deeplab_os16_dm0.5_256.sh /path/to/dataset /path/to/training/directory

Evaluation

Evaluation scripts, like the training scripts, accept two arguments: the dataset path and the train directory. If the train directory argument points to a specific checkpoint file, only that checkpoint is evaluated; otherwise, the latest checkpoint is evaluated. We recommend running the evaluation scripts alongside the training scripts to obtain evaluation metrics for every checkpoint.
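The checkpoint selection rule described above can be illustrated with a small standalone sketch. This is not the repository's implementation: the function name `resolve_checkpoint` is hypothetical, and it assumes checkpoints are saved with the common TensorFlow 1.x naming pattern `<prefix>-<step>.index`. In real TensorFlow code, `tf.train.latest_checkpoint` serves a similar purpose by reading the `checkpoint` state file.

```python
import re
import tempfile
from pathlib import Path

# Assumed naming pattern: "<prefix>-<global_step>.index"
_STEP_RE = re.compile(r"-(\d+)\.index$")


def resolve_checkpoint(path):
    """Pick the checkpoint prefix to evaluate (illustrative sketch).

    If `path` is not a directory, it is treated as an explicit checkpoint
    and returned unchanged. If it is a directory, the prefix with the
    highest step number is returned, or None when no checkpoint is found.
    """
    path = Path(path)
    if not path.is_dir():
        return str(path)
    candidates = []
    for index_file in path.glob("*.index"):
        match = _STEP_RE.search(index_file.name)
        if match:
            # Strip the ".index" suffix to get the checkpoint prefix.
            candidates.append((int(match.group(1)), index_file.with_suffix("")))
    if not candidates:
        return None
    return str(max(candidates, key=lambda item: item[0])[1])


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        for step in (100, 2000, 900):
            (Path(tmp) / f"model.ckpt-{step}.index").touch()
        print(Path(resolve_checkpoint(tmp)).name)  # model.ckpt-2000
```

Comparing step numbers as integers rather than sorting file names avoids picking `model.ckpt-900` over `model.ckpt-2000`, which a plain string sort would do.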

MMNet

./scripts/valid_mmnet_dm1.0_256.sh /path/to/dataset /path/to/training/directory

Mobile DeepLabv3

./scripts/valid_deeplab_os16_dm0.5_256.sh /path/to/dataset /path/to/training/directory

Demo

Refer to demo/demo.mp4.

License

Apache License 2.0
