• Stars
    star
    192
  • Rank 200,884 (Top 4 %)
  • Language
    Python
  • License
    Other
  • Created almost 2 years ago
  • Updated 11 months ago

Reviews

There are no reviews yet. Be the first to send feedback to the community and the maintainers!

Repository Details

[CVPR 2023] Real-Time Neural Light Field on Mobile Devices

Real-Time Neural Light Field on Mobile Devices

Project | ArXiv | PDF

This repository is for the real-time neural rendering introduced in the following CVPR'23 paper:

Real-Time Neural Light Field on Mobile Devices
Junli Cao 1, Huan Wang 2, Pavlo Chemerys1, Vladislav Shakhrai1, Ju Hu1, Yun Fu 2, Denys Makoviichuk1, Sergey Tulyakov 1, Jian Ren 1

1 Snap Inc. 2 Northeastern University

Abstract

Recent efforts in Neural Rendering Fields (NeRF) have shown impressive results on novel view synthesis by utilizing implicit neural representation to represent 3D scenes. Due to the process of volumetric rendering, the inference speed for NeRF is extremely slow, limiting the application scenarios of utilizing NeRF on resource-constrained hardware, such as mobile devices. Many works have been conducted to reduce the latency of running NeRF models. However, most of them still require high-end GPU for acceleration or extra storage memory, which is all unavailable on mobile devices. Another emerging direction utilizes the neural light field (NeLF) for speedup, as only one forward pass is performed on a ray to predict the pixel color. Nevertheless, to reach a similar rendering quality as NeRF, the network in NeLF is designed with intensive computation, which is not mobile-friendly. In this work, we propose an efficient network that runs in real-time on mobile devices for neural rendering. We follow the setting of NeLF to train our network. Unlike existing works, we introduce a novel network architecture that runs efficiently on mobile devices with low latency and small size, i.e., saving 15x ~ 24x storage compared with MobileNeRF. Our model achieves high-resolution generation while maintaining real-time inference for both synthetic and real-world scenes on mobile devices, e.g., 18.04ms (iPhone 13) for rendering one 1008x756 image of real 3D scenes. Additionally, we achieve similar image quality as NeRF and better quality than MobileNeRF (PSNR 26.15 vs. 25.91 on the real-world forward-facing dataset)

Overview

This repo contains the codebases for both the teacher and student models. We use the public repo ngp_pl as the teacher for more efficient pseudo data distillation(instead of NeRF and MipNeRF as discussed in the paper).

Observed differences between ngp and NeRF teacher:

  1. the training with ngp_pl should be less than 15 mins with 4 GPUs and pseudo data distillation for 10k images is less than 2 hours with single GPU.
  2. ngp renders high quality synthetic scenes than NeRF
  3. no space contraction techniques were employed in ngp, thus having a inferior performance on real-world scenes

Installation

conda virtual environment is recommended. The experiments were conducted on 4 Nvidia V100 GPUs. Training on one GPU should work but takes longer to converge.

MobileR2L

git clone https://github.com/snap-research/MobileR2L.git

cd MobileR2L

conda create -n r2l python==3.9
conda activate r2l
conda install pip

pip install torch torchvision torchaudio
pip install -r requirements.txt 

conda deactivate

NGP_PL

cd model/teacher/ngp_pl

# create the conda env
conda create -n ngp_pl python==3.9
conda activate ngp_pl
conda install pip

# install torch with cuda 116
pip install torch==1.13.1+cu116 torchvision==0.14.1+cu116 torchaudio==0.13.1 --extra-index-url https://download.pytorch.org/whl/cu116

# install tiny-cuda-nn
pip install git+https://github.com/NVlabs/tiny-cuda-nn/#subdirectory=bindings/torch

# install torch scatter
pip install torch-scatter -f https://data.pyg.org/whl/torch-1.13.0+${cu116}.html

# ---install apex---
git clone https://github.com/NVIDIA/apex
cd apex
# denpency for apex
pip install packaging

## if pip >= 23.1 (ref: https://pip.pypa.io/en/stable/news/#v23-1) which supports multiple `--config-settings` with the same key... 
pip install -v --disable-pip-version-check --no-cache-dir --no-build-isolation --config-settings "--build-option=--cpp_ext" --config-settings "--build-option=--cuda_ext" ./
## otherwise
# pip install -v --disable-pip-version-check --no-cache-dir --no-build-isolation --global-option="--cpp_ext" --global-option="--cuda_ext" ./
# ---end installing apex---


cd ../
# install other requirements
pip install -r requirements.txt

# build
pip install models/csrc/

# go to root
cd ../../../

Dataset

Download the example data: lego and fern

sh script/download_example_data.sh

Training the Teacher

cd model/teacher/ngp_pl

export ROOT_DIR=../../../dataset/nerf_synthetic/
python3 train.py \
     --root_dir $ROOT_DIR/lego \
     --exp_name lego\
     --num_epochs 30 --batch_size 16384 --lr 2e-2 --eval_lpips --num_gpu 4 

or running the bash script

sh benchmarking/benchmark_synthetic_nerf.sh lego

Once we have the teacher trained(checkpoints saved already), we can start to generate the pseudo data for MobileR2L. Depending your disk storage, the number of pseudo images could range from 2,000 to 10,000(performance varies!). Here, we set the number to 5000.

export ROOT_DIR=../../../dataset/nerf_synthetic/
python3 train.py \
    --root_dir $ROOT_DIR/lego \
    --exp_name Lego_Pseudo  \
    --save_pseudo_data \
    --n_pseudo_data 5000 --weight_path ckpts/nerf/lego/epoch=29_slim.ckpt \
    --save_pseudo_path Pseudo/lego --num_gpu 1

or running the bash script

sh benchmarking/distill_nerf.sh lego

Training MobileR2L

# go to the MobileR2L directory
cd ../../../MobileR2L

conda activate r2l

# use 4 gpus for training: NeRF
sh script/benchmarking_nerf.sh 4 lego

# use 4 gpus for training: LLFF
sh script/benchmarking_llff.sh 4 orchids

The model will be running a day or two depending on you GPUs. When the model converges, it will automatically export the onnx files to the Experiment/Lego_** folder. There should be three onnx files: Sampler.onnx, Embedder.onnx and *_SnapGELU.onnx.

Alternatively, you can export the onnx manully by running the flowing script with --ckpt_dir replaced by the trained model:

sh script/export_onnx_nerf.sh lego path/to/ckpt

Run AR lens in Snapchat

We provide the snapcodes for the AR lens in Snapchat. Scan it with Snapchat and try it out! Note: full-resolution lens need iPhone 13 or above to run smoothly in Snapchat. Try to reduce to a smaller resolution for other phones.

Future Plan

We are working on releasing a tutorial on how to utilize our method to create your own AR assets and lens that is fully compatiable with SnapML.

Acknowledgement

In this code we refer to the following implementations: nerf-pytorch, R2L and ngp_pl. We also refer to some great implementation from torch-ngp and MipNeRF. Great thanks to them! Our code is largely built upon their wonderful implementation. We also greatly thank the anounymous CVPR'23 reviewers for the constructive comments to help us improve the paper.

Reference

If our work or code helps you, please consider to cite our paper. Thank you!

@inproceedings{cao2023real,
  title={Real-Time Neural Light Field on Mobile Devices},
  author={Cao, Junli and Wang, Huan and Chemerys, Pavlo and Shakhrai, Vladislav and Hu, Ju and Fu, Yun and Makoviichuk, Denys and Tulyakov, Sergey and Ren, Jian},
  booktitle={Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition},
  pages={8328--8337},
  year={2023}
}

More Repositories

1

articulated-animation

Code for Motion Representations for Articulated Animation paper
Jupyter Notebook
1,210
star
2

EfficientFormer

EfficientFormerV2 [ICCV 2023] & EfficientFormer [NeurIPs 2022]
Python
972
star
3

NeROIC

Python
909
star
4

HyperHuman

[ICLR 2024] Github Repo for "HyperHuman: Hyper-Realistic Human Generation with Latent Structural Diffusion"
HTML
489
star
5

Panda-70M

[CVPR 2024] Panda-70M: Captioning 70M Videos with Multiple Cross-Modality Teachers
Python
459
star
6

MoCoGAN-HD

[ICLR 2021 Spotlight] A Good Image Generator Is What You Need for High-Resolution Video Synthesis
Python
240
star
7

3dgp

3D generation on ImageNet [ICLR 2023]
Python
207
star
8

MMVID

[CVPR 2022] Show Me What and Tell Me How: Video Synthesis via Multimodal Conditioning
Python
194
star
9

R2L

[ECCV 2022] R2L: Distilling Neural Radiance Field to Neural Light Field for Efficient Novel View Synthesis
Python
189
star
10

CAT

[CVPR 2021] Teachers Do More Than Teach: Compressing Image-to-Image Models (CAT)
Python
180
star
11

discoscene

CVPR 2023 Highlight: DiscoScene
Python
138
star
12

3DVADER

Source code for the paper: "AutoDecoding Latent 3D Diffusion Models"
132
star
13

BitsFusion

118
star
14

SnapFusion

HTML
95
star
15

F8Net

[ICLR 2022 Oral] F8Net: Fixed-Point 8-bit Only Multiplication for Network Quantization
Python
95
star
16

SF-V

This respository contains the code for SF-V: Single Forward Video Generation Model.
82
star
17

AToM

Official implementation of `AToM: Amortized Text-to-Mesh using 2D Diffusion`
82
star
18

graphless-neural-networks

[ICLR 2022] Code for Graph-less Neural Networks: Teaching Old MLPs New Tricks via Distillation (GLNN)
Python
75
star
19

MLPInit-for-GNNs

[ICLR 2023] MLPInit: Embarrassingly Simple GNN Training Acceleration with MLP Initialization
Jupyter Notebook
69
star
20

unsupervised-volumetric-animation

The repository for paper Unsupervised Volumetric Animation
Python
67
star
21

non-contrastive-link-prediction

[ICLR 2023] Link Prediction with Non-Contrastive Learning
Python
26
star
22

linkless-link-prediction

[ICML 2023] Linkless Link Prediction via Relational Distillation
Python
18
star
23

locomo

Python
15
star
24

LargeGT

Graph Transformers for Large Graphs
Python
13
star
25

efficient-nn-tutorial

Page for the CVPR 2023 Tutorial - Efficient Neural Networks: From Algorithm Design to Practical Mobile Deployments
HTML
13
star
26

weights2weights

Official Implementation of weights2weights
12
star
27

SpFDE

[NeurIPs 2022] Layer Freezing & Data Sieving: Missing Pieces of a Generic Framework for Sparse Training
11
star
28

representations-for-creativity

HTML
7
star
29

hpdm

Hierarchical Patch Diffusion Models for High-Resolution Video Synthesis [CVPR 2024]
HTML
7
star
30

video-synthesis-tutorial

HTML
5
star
31

snap-research-website

https://research.snap.com/
HTML
2
star
32

promptable-game-models

2
star
33

NeurT-FDR

NeurT-FDR, a method for controlling false discovery rate by incorporating feature hierarchy
Python
2
star
34

qfar

Official implementation of MobiCom 2023 paper "QfaR: Location-Guided Scanning of Visual Codes from Long Distances"
Python
1
star
35

cabam-graph-generation

[KDD MLG'20] Class-Assortative Barabasi Albert Model for Graph Generation
Jupyter Notebook
1
star
36

cv-call-for-interns-2022

HTML
1
star
37

NodeDup

Node Duplication Improves Cold-start Link Prediction
Python
1
star
38

SPAD

Source code for paper "SPAD: Spatially Aware Multi-View Diffusers"
1
star
39

snapvideo

HTML
1
star