Fast_Portrait_Segmentation
Fast (aiming for "real time") portrait segmentation on mobile phones
This project is not general semantic segmentation; it focuses on real-time portrait segmentation. All experiments use PyTorch.
I hope to find an efficient network that can run on a mobile phone. Successful applications of person body/portrait segmentation can currently be found in apps such as SNOW and B612, whose technology was developed by the Korean company Nalbi.
Models
-
mobilenet_dilate_unet[1][2][7][9]
Encoder : mobilenet_v2 (output stride: 32)
Decoder : unet (concatenates low-level features), with dilated convolutions at different stages (d = 2, 6, 12, 18)
-
Shuffle_Seg_SkipNet[4][10][18]
Encoder : shufflenet
Decoder : skip connection (adds low-level features)
-
esp_dense_seg[20][10][15][19]
-
residualdense_bisenet[15][23][24]
An attention model is a potentially useful module for the segmentation task. I use a very light residual-dense net as the backbone of the Context Path. How the last features of the Context Path are fused is not clear in the paper (BiSeNet: Bilateral Segmentation Network for Real-time Semantic Segmentation).
-
Segmentation + Matting [7][12][15]
Hard segmentation + soft matting (coming soon).
mobile_phone_human_matting
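To make the numbers in the mobilenet_dilate_unet entry above more concrete, here is a minimal arithmetic sketch. It assumes 3x3 kernels and a 224x224 input, neither of which is stated in this README; the helper names are hypothetical and are not taken from the project code. It shows how large the encoder output is at output stride 32 and how many pixels one dilated kernel spans for each listed rate.

```python
def effective_kernel(kernel_size: int, dilation: int) -> int:
    """Spatial extent (in pixels) covered by one dilated convolution kernel."""
    return kernel_size + (kernel_size - 1) * (dilation - 1)


def feature_map_size(input_size: int, output_stride: int) -> int:
    """Side length of the encoder output for a square input."""
    return input_size // output_stride


# Dilation rates listed for the mobilenet_dilate_unet decoder.
DILATIONS = (2, 6, 12, 18)

if __name__ == "__main__":
    # Assumption: 224x224 input; the README does not state the resolution.
    print("encoder output side:", feature_map_size(224, 32))
    for d in DILATIONS:
        # Assumption: 3x3 kernels; the README only gives the dilation rates.
        print(f"d={d}: 3x3 kernel spans {effective_kernel(3, d)} pixels")
```

Under these assumptions the largest rate spans more pixels than the encoder output itself, so it would only see real context on the higher-resolution decoder stages.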
update 2019/04/10: The code and pre-trained model of the final version of portrait_segmentation are released!
Speed Analysis
⚡ Real-time!
Platform : ncnn.
Mobile phone: Samsung Galaxy S8+ (CPU).
model | size (M) | time (ms) |
---|---|---|
model_seg_matting | 3.3 | ~40 |
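As a rough reading of the table above, the per-frame time can be converted to a frame rate. This is only a sketch: it assumes the reported time covers one full frame, and the function name is hypothetical.

```python
def frames_per_second(latency_ms: float) -> float:
    """Convert per-frame latency in milliseconds to throughput in frames per second."""
    if latency_ms <= 0:
        raise ValueError("latency_ms must be positive")
    return 1000.0 / latency_ms


if __name__ == "__main__":
    # ~40 ms is the model_seg_matting figure from the table above.
    print(f"{frames_per_second(40):.0f} FPS")
```

At about 40 ms per frame this gives roughly 25 frames per second; any camera capture or rendering cost not included in the measurement would lower that.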
update 2018/12/27: Demo video on my iPhone 6 (baiduyun)
Result Examples
The recently released HUAWEI Mate 20 can keep the person in color and turn the background gray in real time (click to view). I tested my model on the CPU of my Mac and recorded some videos here.
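The effect described above (person in color, background in gray) can be sketched as a per-pixel blend between the original frame and its grayscale version, weighted by the predicted mask. This is a minimal pure-Python illustration, not the code used for the videos: the function names are hypothetical, the BT.601 luma weights are an assumed choice of grayscale conversion, and a real pipeline would use vectorized array operations.

```python
from typing import List, Tuple

Pixel = Tuple[int, int, int]


def to_gray(pixel: Pixel) -> int:
    """Luma of an RGB pixel using the ITU-R BT.601 weights (assumed choice)."""
    r, g, b = pixel
    return round(0.299 * r + 0.587 * g + 0.114 * b)


def color_splash(image: List[List[Pixel]], alpha: List[List[float]]) -> List[List[Pixel]]:
    """Keep color where alpha is 1 (person) and fade to gray where alpha is 0."""
    out = []
    for row, alpha_row in zip(image, alpha):
        out_row = []
        for pixel, a in zip(row, alpha_row):
            gray = to_gray(pixel)
            # Blend each channel between its original value and the gray value.
            out_row.append(tuple(round(a * c + (1.0 - a) * gray) for c in pixel))
        out.append(out_row)
    return out


if __name__ == "__main__":
    image = [[(200, 50, 50), (10, 200, 30)]]
    alpha = [[1.0, 0.0]]  # first pixel is person, second is background
    print(color_splash(image, alpha))
```

A hard segmentation mask gives alpha values of exactly 0 or 1, while a soft matte gives fractional values along hair and edges, which this blend handles without changes.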
References
papers
- [1] MobileNets: Efficient Convolutional Neural Networks for Mobile Vision Applications
- [2] MobileNetV2: Inverted Residuals and Linear Bottlenecks
- [3] ShuffleNet: An Extremely Efficient Convolutional Neural Network for Mobile Devices
- [4] ShuffleNet V2: Practical Guidelines for Efficient CNN Architecture Design
- [5] CondenseNet: An Efficient DenseNet using Learned Group Convolutions
- [6] Xception: Deep Learning with Depthwise Separable Convolutions
- [7] U-Net: Convolutional Networks for Biomedical Image Segmentation
- [8] Rethinking Atrous Convolution for Semantic Image Segmentation
- [9] Encoder-Decoder with Atrous Separable Convolution for Semantic Image Segmentation
- [10] Fully Convolutional Networks for Semantic Segmentation
- [11] Automatic Portrait Segmentation for Image Stylization
- [12] Fast Deep Matting for Portrait Animation on Mobile Phone
- [13] DenseASPP for Semantic Segmentation in Street Scenes
- [14] Learning a Discriminative Feature Network for Semantic Segmentation
- [15] ERFNet: Efficient Residual Factorized ConvNet for Real-time Semantic Segmentation
- [16] ENet: A Deep Neural Network Architecture for Real-Time Semantic Segmentation
- [17] SegNet: A Deep Convolutional Encoder-Decoder Architecture for Image Segmentation
- [18] ICNet for Real-Time Semantic Segmentation on High-Resolution Images
- [19] ShuffleSeg: Real-time Semantic Segmentation Network
- [20] ESPNet: Efficient Spatial Pyramid of Dilated Convolutions for Semantic Segmentation
- [21] Efficient Semantic Segmentation using Gradual Grouping
- [22] Analysis of Efficient CNN Design Techniques for Semantic Segmentation
- [23] Dual Attention Network for Scene Segmentation
- [24] BiSeNet: Bilateral Segmentation Network for Real-time Semantic Segmentation
- [25] CCNet: Criss-Cross Attention for Semantic Segmentation
- [26] ESPNetv2: A Light-weight, Power Efficient, and General Purpose Convolutional Neural Network