
ArcNerf

ArcNerf is a framework consisting of many state-of-the-art NeRF-based methods, with useful functionality for novel view rendering and object extraction.

The framework is highly modular, which allows you to modify any component in the pipeline and easily develop your own algorithms.

It is developed by TencentARC Lab.



News

[2023.02] We provide an HDR-NeRF implementation and a benchmark on the HDR-Real dataset. We also provide simple guidance for adding new algorithms/datasets to the pipeline. See the guidance.


What is special in this project?

In the past few months, many frameworks built around a common NeRF-based pipeline have been proposed:

(And a very nice start-up, Luma AI, is working on NeRF-based real-time rendering of 3D scenes.)

All of these amazing works try to bring state-of-the-art NeRF-based methods together into a complete, modular framework in which any component can be changed and experiments can be run quickly.

Toward the same goal, we are working on the following areas that could make this project helpful to the community:

  • Here is the framework overview. Note that not all of the designed features have been implemented in this framework yet (e.g. the traditional MVS branch). We are working on extending it in the near future.

(framework overview figure)

  • Highly modular pipeline design:
    • Every field is separated, and you can plug in any newly developed module under the framework. Those fields are easily controlled by the config files and can be modified without harming the others.
    • We provide both an SDF model and a background model for modeling the object and the background, which are not commonly provided in other repos.


  • Based on this pipeline, we can easily extend the original NeRF/NeuS models to:

    • NeRF with volume pruning, with freq embedding
    • NeRF with hashGrid embedding
    • NeRF with hashGrid embedding + volume pruning -> NGP model
    • NeuS with volume pruning, with freq embedding
    • NeuS with hashGrid embedding
    • NeuS with hashGrid embedding + volume pruning -> NeuS-NGP model
  • Plugging in modules like volume pruning makes training faster and generates better results. See expr for more details.

  • foreground_model and background_model are well separated. Each bounds its modeling to an inside/outside area by a sphere, volume, or other geometric structure. This is suitable for common, casually captured videos and provides a high-quality object mesh and background rendering at the same time (see the compositing sketch after this list).

  • Unified datasets and benchmarks:

    • We split the datasets following the official repos, and all methods run under the same settings for fair comparison.
    • We also provide unittests for the datasets so you can easily check whether the data configs are correct.


  • Many useful functions are provided:

    • Mesh extraction on the density model or SDF model. (We are still working on incorporating better extraction functions to collect assets for modern graphics engines.)
    • Colmap preparation for your own captured data.
    • Surface rendering on the SDF model.
    • Plentiful geometry functions implemented with a torch backend.
    • For other functions on the trainer and logging, please refer to the doc.
  • Render (example rendering results)

  • Extraction (example mesh/texture outputs)

  • Docs and code:

    • All functions come with detailed docs on their usage, and the operations are commented with tensor sizes, which makes it easy to follow how the components change.
    • We have implemented helpful geometry functions on mesh/rays/sphere/volume/cam_pose in torch (some as CUDA extensions). They could be useful in other 3D-related projects as well.
    • We also provide experiment notes on our trials.
  • Tests and visual helpers:

    • We have developed an interactive visualizer to easily test the correctness of our geometry functions. It is compatible with torch/np arrays.
    • We have written many unittests for the geometry and modeling functions. Take a look, and you will easily understand how to use the visualizer to check your own implementation.

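For intuition, here is a minimal sketch of the standard foreground/background compositing idea behind such a two-model setup. The function and tensor names are illustrative only, not ArcNerf's actual API; it simply shows that whatever light the foreground does not absorb is filled in by the background model.

```python
import torch

def composite_fg_bg(fg_rgb, fg_alpha, bg_rgb):
    """Blend per-ray foreground and background colors (illustrative sketch).

    fg_rgb:   (N_rays, 3) color accumulated inside the foreground bound
    fg_alpha: (N_rays, 1) opacity accumulated inside the foreground bound
    bg_rgb:   (N_rays, 3) color rendered by the background model
    """
    # the transmittance left over after the foreground is filled by the background
    return fg_rgb + (1.0 - fg_alpha) * bg_rgb

# toy usage with random per-ray values
n_rays = 4
out = composite_fg_bg(torch.rand(n_rays, 3), torch.rand(n_rays, 1), torch.rand(n_rays, 3))
print(out.shape)  # torch.Size([4, 3])
```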

We are still working on many other helpful functions for better rendering and extraction; please refer to the todo for more details.

Open an issue if you have any suggestions!


Installation

Get the repo with git clone https://github.com/TencentARC/ArcNerf --recursive

  • Install the libs with pip install -r requirements.txt.
  • Install the customized ops with sh scripts/install_ops.sh. (Only needed for volume sampling/pruning, etc.)
  • Install the tiny-cuda-nn modules with sh scripts/install_tinycudann.sh. (Only needed for fusemlp/hashgrid encoding, etc.)

We tested with the following environment:

  • GPU: NVIDIA A100 with CUDA 11.1 (a lower version may break the tinycudann module).
  • cmake: 3.21.3 (>=3.21)
  • gcc: 8.3.1 (>=5.4)
  • python: 3.8.5 (>=3.7)
  • torch: 1.9.1

Colmap

Colmap is used to estimate camera locations and a sparse point cloud, which lets you run the algorithms on your own data.

Install it following their instructions.

Branch

Two branches, simplenerf and simplengp, contain only the vanilla NeRF and Instant-NGP models, with a less complicated model design.


Usage

Data Preparation

  • Download and prepare the public datasets following the instruction.
  • If you use your own captured data, scripts/data_process.sh will help you extract the frames and estimate the cameras.

Train

Train with python train.py --configs configs/default.yaml --gpu_ids 0.

  • --gpu_ids -1 uses the CPU, which is handy for debugging the code line by line in a local IDE like PyCharm without a GPU device.
  • For more details on the config, see default.yaml.
  • Add --resume path/to/model to resume training from a checkpoint. The model is saved periodically to guard against unexpected errors.
  • For more details on the training pipeline, visit common_trainer and trainer.

Evaluate

Evaluate with python evaluate.py --configs configs/eval.yaml --gpu_ids 0. You can set the target model with --model_pt path/to/model.

Inference

Inference produces customized rendering videos and extracts mesh output. Run it with python inference.py --configs configs/eval.yaml --gpu_ids 0. You can set the target model with --model_pt path/to/model.
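As background on the mesh-extraction step, below is a minimal sketch of extracting a mesh from a density field with marching cubes via scikit-image. The density_fn callable, grid resolution, bound, and threshold are placeholders, not ArcNerf's actual interface.

```python
import numpy as np
import torch
from skimage import measure

def extract_mesh(density_fn, resolution=128, bound=1.0, level=50.0):
    """Extract a triangle mesh from a density field (illustrative sketch).

    density_fn: callable mapping an (N, 3) xyz tensor to (N,) densities (placeholder).
    """
    # sample densities on a dense grid covering [-bound, bound]^3
    xs = np.linspace(-bound, bound, resolution, dtype=np.float32)
    grid = np.stack(np.meshgrid(xs, xs, xs, indexing='ij'), axis=-1)  # (R, R, R, 3)
    with torch.no_grad():
        density = density_fn(torch.from_numpy(grid.reshape(-1, 3)))
    density = density.reshape(resolution, resolution, resolution).cpu().numpy()

    # run marching cubes at the chosen density threshold
    verts, faces, _, _ = measure.marching_cubes(density, level=level)

    # map voxel indices back to world coordinates
    verts = verts / (resolution - 1) * 2.0 * bound - bound
    return verts, faces
```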

Notebook

Some notebooks are provided to help you understand what happens during inference and how to use our visualizer. Go to notebook for more details.


Datasets and Benchmarks

All datasets inherit from the same data class for ray generation and image/mask preparation. All a new class needs to do is read the images and camera poses for each mode split. The details are here.
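As an illustration of how little a new dataset class typically needs, here is a sketch; the class name, file layout, and split rule below are hypothetical placeholders, not ArcNerf's real signatures.

```python
import os
import glob
import numpy as np

class MyCaptureDataset:
    """Hypothetical dataset sketch: supply images and camera poses per mode split."""

    def __init__(self, root, mode='train'):
        self.root = root
        self.mode = mode                      # 'train' / 'eval' split
        self.img_paths = self._list_images()  # per-split image files
        self.poses, self.intrinsics = self._read_cameras()

    def _list_images(self):
        paths = sorted(glob.glob(os.path.join(self.root, 'images', '*.png')))
        # hold out every 8th frame for eval (a common convention, not ArcNerf's rule)
        return [p for i, p in enumerate(paths) if (i % 8 != 0) == (self.mode == 'train')]

    def _read_cameras(self):
        # placeholder: load c2w poses (N, 4, 4) and intrinsics (3, 3), e.g. from colmap output
        poses = np.load(os.path.join(self.root, 'poses.npy'))
        intrinsics = np.load(os.path.join(self.root, 'intrinsics.npy'))
        return poses, intrinsics
```

Ray generation and image/mask preparation are then inherited from the shared data class.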

Self-Capture data and Colmap

We support customized data via the Capture dataset. Record a clip of video around the object and run the pre-process script scripts/data_process.sh. Note that colmap results depend heavily on how the video is filmed: a clear, stable video covering the full range of angles around the object gives more accurate results.

Visual helper

We put all the dataset configs under conf; you can check them by running the unittest tests/tests_arcnerf/tests_dataset/tests_any_dataset.py and inspecting the results.

For any dataset we provide, you should check the configs like this to ensure the settings are correct.

(Visual checks: ray, camera, and point cloud plots.)

Benchmark

See benchmark for details, and expr for some trials we have conducted and some problems we have met.



Full Benchmark on NeRF synthetic dataset

All images use a white background, the same evaluation setting as vanilla NeRF. Numbers are PSNR.

             chair   drums   ficus   hotdog  lego    materials  mic     ship    avg
nerf         33.30   25.11   30.47   36.73   32.86   29.87      33.24   28.70   31.285
nerf(paper)  33.00   25.01   30.13   36.18   32.54   29.62      32.91   28.65   31.043
ngp          34.88   25.50   30.55   36.92   35.38   29.12      34.80   28.39   31.942
ngp(paper)   34.28   25.70   33.13   36.99   36.12   29.35      35.67   30.61   32.731
  • The NGP paper reports PSNR with a black background, which gives higher values than a white background.
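For reference, here is a minimal sketch of the standard PSNR computation (not ArcNerf-specific code). Note that compositing onto a white versus black background changes both the prediction and the ground truth, so the two settings are not directly comparable.

```python
import torch

def psnr(pred, gt):
    """PSNR in dB for image tensors with values in [0, 1]."""
    mse = torch.mean((pred - gt) ** 2)
    return -10.0 * torch.log10(mse)
```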

Models

The models are highly modular and organized in a level structure. You are free to modify components at each level through the configs, or to easily develop a new algorithm and plug it in at the desired place.

For more details on the structure of the model class, visit model to understand each of its components.



Geometry

We implement plentiful geometry functions in torch under geometry. The operations are batch-based, and their correctness is checked by unittests.

See the doc for more details on what is supported.
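As an example of the batch-based style (a generic sketch, not code taken from the repo), a ray-sphere intersection in torch might look like this:

```python
import torch

def ray_sphere_intersection(rays_o, rays_d, radius=1.0):
    """Intersect N rays with a sphere of given radius centered at the origin.

    rays_o, rays_d: (N, 3) origins and unit directions.
    Returns (near, far) of shape (N,), with non-intersecting rays set to 0.
    """
    # solve ||o + t*d||^2 = r^2  ->  t^2 + 2(o.d)t + (||o||^2 - r^2) = 0
    b = 2.0 * (rays_o * rays_d).sum(dim=-1)
    c = (rays_o * rays_o).sum(dim=-1) - radius ** 2
    disc = b ** 2 - 4.0 * c
    hit = disc > 0
    sqrt_disc = torch.sqrt(torch.clamp(disc, min=0.0))
    near = torch.where(hit, (-b - sqrt_disc) / 2.0, torch.zeros_like(b))
    far = torch.where(hit, (-b + sqrt_disc) / 2.0, torch.zeros_like(b))
    return near, far
```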


Offline Visualization

We provide an offline interactive 3D visualizer with a plotly backend. All geometry components in numpy arrays can be easily plugged into the visualizer. It is compatible with torch-based 3D projects and helpful for debugging your implementation of geometric functions.

We provide a notebook showing example usage. You can refer to the doc for more details.

This visualizer is also available as a separate repo; please go to ArcVis if you find it helpful.
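For a flavor of what such a plotly-backed visualizer does (this is a generic plotly sketch; the actual interface lives in ArcVis and the notebook):

```python
import numpy as np
import plotly.graph_objects as go

# a ring of 3D points and a single ray, shown in an interactive 3D plot
theta = np.linspace(0, 2 * np.pi, 64)
pts = np.stack([np.cos(theta), np.sin(theta), np.zeros_like(theta)], axis=-1)
ray_o = np.zeros(3)
ray_d = np.array([1.0, 1.0, 1.0]) / np.sqrt(3)
ray = np.stack([ray_o, ray_o + 2.0 * ray_d])

fig = go.Figure()
fig.add_trace(go.Scatter3d(x=pts[:, 0], y=pts[:, 1], z=pts[:, 2],
                           mode='markers', marker=dict(size=2), name='points'))
fig.add_trace(go.Scatter3d(x=ray[:, 0], y=ray[:, 1], z=ray[:, 2],
                           mode='lines', name='ray'))
fig.show()
```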


Code and Tests

We have made many unittests for checking the geometry functions and models. See the doc to learn how to run the tests and get visual results.

We suggest you write your own unittests when developing new algorithms to ensure correctness.

Comments in the code are also helpful for learning how to use the functions and how the tensor sizes change.


Trainer

We use our own training pipeline, which provides many customized functions. It is modular, and it is easy to add or modify any part of the training pipeline.

We have another repo, common_trainer. You can also refer to the doc for more information.
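As a rough sketch of the kind of modular loop such a trainer wraps (names and hooks here are illustrative, not the actual common_trainer API):

```python
import torch

def train_loop(model, loss_fn, dataloader, optimizer, epochs, on_epoch_end=None):
    """Minimal training loop with a pluggable end-of-epoch hook (illustrative)."""
    for epoch in range(epochs):
        for batch in dataloader:
            optimizer.zero_grad()
            output = model(batch)
            loss = loss_fn(output, batch)
            loss.backward()
            optimizer.step()
        if on_epoch_end is not None:
            on_epoch_end(epoch, model)  # e.g. save a checkpoint, run eval, log metrics
```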


Web Viewer

Thanks to the very powerful nerfstudio viewer, we have adapted ArcNerf to it so that it can easily show the training and evaluation results of models from this project.


To use it during training, follow train_cfgs to add the viewer confs in viewer_cfg. You can then start training as usual with python train.py --configs configs/path_to_config.

After training, you can use the online viewer to visualize the results. Call python tools/vis_ns_viewer.py --configs configs/path_to_config --model_pt path_to_model to visualize the dataset and rendering outputs. You can also visualize only the fg_model by setting --viewer.show_fg_only True.

For how to use the viewer, please visit their doc.

Export cam path

You can also add inference cameras using their render panel, which exports a json file. Set inference.render.json_path to the file location, and our pipeline will render the video along that customized camera path.


We may make the viewer more compatible with this project in the future, for example by adding more functions for object-level rendering or manipulation. Thanks again to the authors of nerfstudio!


License

Check LICENSE.


Acknowledgements

Please see Citation. Thanks to those amazing projects.

If you find this project useful, please consider citing:

@misc{arcnerf,
  author={Yue Luo, Yan-Pei Cao},
  title={arcnerf: nerf-based object/scene rendering and extraction framework},
  url={https://github.com/TencentARC/arcnerf/},
  year={2022},
}
You can contact the author at [email protected] if you need any help.
