EgocentricReconstruction

Egocentric Scene Reconstruction from an Omnidirectional Video

Project page | Paper | Presentation | 360 RGBD Dataset

Hyeonjoong Jang, Andreas Meuleman, Dahyun Kang, Donggun Kim, Christian Richardt, Min H. Kim
KAIST, University of Bath

In this repository, we provide the code for the paper 'Egocentric Scene Reconstruction from an Omnidirectional Video', presented at SIGGRAPH 2022.

Tested Environments

  • OS: Ubuntu 21.10
  • Graphics card: TITAN RTX
  • CUDA Toolkit version: 11.1
  • NVIDIA driver version: 495.44
  • Python: 3.9.7
  • torch version: 1.9.0+cu111
  • CMake: >3.17

Installation

Dependencies

Install dependencies by running the following commands.

sudo apt-get install libgles2-mesa-dev
sudo apt-get install libboost-all-dev
sudo apt-get install libomp-dev
sudo apt-get install python-opengl
sudo apt-get install libeigen3-dev

Conda Environments

Create a conda environment by running the following commands.

conda env create -f environment.yml
conda activate egocentric
pip3 install torch==1.9.0+cu111 torchvision==0.10.0+cu111 torchaudio==0.9.0 -f https://download.pytorch.org/whl/torch_stable.html
pip install scipy
pip install scikit-learn

RAFT & Dual Marching Cube

First, download and install OpenCV (https://opencv.org/) and the CUDA Toolkit 11.1 (https://developer.nvidia.com/cuda-11.1.0-download-archive).

Our Binoctree implementation is based on RAFT (https://github.com/princeton-vl/RAFT) and dmc (https://github.com/planaria/dmc).

Clone and build both by running the script below. It clones the repositories and replaces the modified source files with those in the delta folder.

sh setup.sh

How To Run Demo


python main.py --config demo/config/demo.json

Our code is controlled by values specified in a JSON file. We provide a sample file at demo/config/demo.json.
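For instance, the sketch below loads the sample config, overrides a couple of values, and writes a new config to run with. The key names are taken from the Parameters section below; whether these exact keys appear in demo.json is an assumption, so consult the sample file itself.

import json

# Load the sample configuration shipped with the demo.
with open("demo/config/demo.json") as f:
    config = json.load(f)

# Override a few values. The key names follow the Parameters list below;
# that these exact keys appear in demo.json is an assumption.
config["step"] = "2345"          # run steps 2-5, skipping masking and SLAM
config["video_name"] = "sponza"  # folder that contains video.mp4

with open("demo/config/my_run.json", "w") as f:
    json.dump(config, f, indent=4)

Then run python main.py --config demo/config/my_run.json.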

Each step is explained below.

Step 0: Mask Generation

  • It will create a window and play the input video on loop.
  • Drag inside the window several times to mask out unwanted regions, such as the photographer.
  • Press 's' to end the step.
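To illustrate the idea behind this step, here is a minimal, hypothetical OpenCV sketch (not the repository's implementation) that paints a mask while dragging and saves it when 's' is pressed:

import cv2
import numpy as np

drawing = False

def on_mouse(event, x, y, flags, mask):
    # Paint filled circles into the mask while the left button is held.
    global drawing
    if event == cv2.EVENT_LBUTTONDOWN:
        drawing = True
    elif event == cv2.EVENT_LBUTTONUP:
        drawing = False
    elif event == cv2.EVENT_MOUSEMOVE and drawing:
        cv2.circle(mask, (x, y), 20, 255, -1)

cap = cv2.VideoCapture("demo/sponza/video.mp4")  # hypothetical path
ok, frame = cap.read()
mask = np.zeros(frame.shape[:2], np.uint8)

cv2.namedWindow("mask")
cv2.setMouseCallback("mask", on_mouse, mask)
while True:
    ok, frame = cap.read()
    if not ok:  # loop the video, as the real step does
        cap.set(cv2.CAP_PROP_POS_FRAMES, 0)
        continue
    frame[mask > 0] = (0, 0, 255)  # show masked pixels in red
    cv2.imshow("mask", frame)
    if cv2.waitKey(30) & 0xFF == ord("s"):  # 's' ends the step
        break

cv2.imwrite("mask.png", mask)
cv2.destroyAllWindows()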

Step 1: Pose Estimation (SLAM)

Step 2: Depth Estimation

  • It will create a demo/depth/sponza_depth folder and save depth maps (inverse depth).
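Since the maps store inverse depth, converting one back to metric depth is a one-liner. A minimal sketch, assuming the maps are saved as .npy arrays (the actual file format and naming are assumptions):

import numpy as np

inv_depth = np.load("demo/depth/sponza_depth/000000.npy")  # hypothetical file name

# Inverse depth -> depth, guarding against division by zero (e.g. sky pixels).
depth = 1.0 / np.maximum(inv_depth, 1e-6)
print(depth.min(), depth.max())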

Step 3: Weight calculation

  • It will create a demo/depth/sponza_weight folder and save weight maps.
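The weight maps are consumed during TSDF accumulation in Step 4. For background, this is the standard weighted running-average update used in TSDF fusion; that the per-pixel weights enter exactly this way here is an assumption:

import numpy as np

def tsdf_update(tsdf, w_acc, sdf_obs, w_obs):
    # Weighted running average: each voxel's SDF is the
    # weight-averaged value of all observations so far.
    w_new = w_acc + w_obs
    tsdf = (tsdf * w_acc + sdf_obs * w_obs) / np.maximum(w_new, 1e-8)
    return tsdf, w_new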

Step 4: Mesh Generation

  • It will create a mesh folder and save .stl and voxel_center.obj files.
  • voxel_center.obj visualizes the center of each voxel (binoctree node) with two colors: red for positive SDF values and blue for negative SDF values.
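Writing such a point visualization is simple; the sketch below uses the common (non-standard) OBJ extension of appending per-vertex RGB values, which is one plausible way to produce a file like voxel_center.obj (the repository's exact format is an assumption):

def write_voxel_centers(path, centers, sdf_values):
    # One 'v x y z r g b' line per voxel center:
    # red for positive SDF values, blue for negative ones.
    with open(path, "w") as f:
        for (x, y, z), sdf in zip(centers, sdf_values):
            r, g, b = (1, 0, 0) if sdf >= 0 else (0, 0, 1)
            f.write(f"v {x} {y} {z} {r} {g} {b}\n")

write_voxel_centers("mesh/voxel_center.obj", [(0, 0, 0), (0.5, 0, 0)], [0.1, -0.2])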

Step 5: Texture Mapping

  • It will create an OpenGL window to visualize a textured mesh.
  • Press 'P' to stop camera movement and manually move the camera.
  • Drag with the left mouse button to rotate the viewing direction.
  • Click the right mouse button to reset the viewing direction.
  • Translate the camera with the '1', '3', '4', '6', '2', '8' keys to move forward, backward, left, right, downward, and upward, respectively.
  • Press 'C' to capture and save an image.
  • Press 'D' to visualize depth.
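The numeric keys follow a numpad-like layout. A small sketch of the key-to-translation mapping described above (step size and axis conventions are assumptions; the viewer's actual camera code may differ):

import numpy as np

# Key -> camera-space translation direction, matching the list above.
KEY_MOVES = {
    "1": np.array([0.0, 0.0, -1.0]),  # forward
    "3": np.array([0.0, 0.0, 1.0]),   # backward
    "4": np.array([-1.0, 0.0, 0.0]),  # left
    "6": np.array([1.0, 0.0, 0.0]),   # right
    "2": np.array([0.0, -1.0, 0.0]),  # downward
    "8": np.array([0.0, 1.0, 0.0]),   # upward
}

def move_camera(position, key, step=0.1):
    # Translate the camera position for one key press.
    return position + step * KEY_MOVES.get(key, np.zeros(3))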

Parameters

  • step: "2345" makes the program run steps 2, 3, 4, and 5 in sequence.
  • video_name: "sponza" specifies the name of the folder that contains video.mp4.
  • base_pth: Path to the fine-tuned RAFT checkpoint.
  • depth_scale: Scale of the depth-map size relative to the input video.
  • sample_ratio: Every sample_ratio-th frame of the video is used for depth estimation.
  • second_neighbor_size: Number of neighboring frames used for depth estimation.
  • second_neighbor_interval: Frame interval between the neighboring frames.
  • mask_radius: Ratio (percentage of width) of the masked region near the baseline axis. Values in [0.1, 0.2] should be fine.
  • solid_angle: Solid-angle threshold for the binoctree subdivision stop condition; Tsolid in our paper.
  • frame_interval: Every frame_interval-th depth map is used for mesh generation (TSDF accumulation).
  • min_depth: rnear in our paper.
  • max_depth: rfar in our paper.
  • sky_depth_threshold: Depth values larger than sky_depth_threshold are ignored.
  • truncation: TSDF truncation slope; em in our paper (see the sketch after this list).
  • truncation_offset: TSDF truncation offset; en in our paper.
  • truncation_change: Additional variable for adaptive truncation.
  • weight_trunc_std: Additional weight for adaptive truncation. A smaller value gives more weight to closer depth estimates.
  • node_sample_interval: Every node_sample_interval-th pixel of each depth map is used for mesh generation (TSDF accumulation). A value of 1 uses all depth estimates.
  • min_sdf_cnt: Only voxels (binoctree nodes) that were updated more than min_sdf_cnt times are used for mesh generation (dual marching cubes). Values from 0 to 5 should work for most cases.
  • sclae_simple, scale_bbox, spline_num, rendering_num: Parameters of the camera movement used for visualization.
  • key_frame_num: Number of representative input frames used for texture mapping.
  • result_rendering_record: Set to 1 to record a video.
  • result_rendering_record_one_cycle_only: Record one cycle only.
  • blend_closest_C_frames: Development variable; always set it to 1.
  • result_rendering_frame_step: Every result_rendering_frame_step-th frame is used for rendering.
  • texture_frame_idx_list_step: Every texture_frame_idx_list_step-th input frame is used for texture mapping.
  • stl_name: File name of the output mesh (stl_name.stl).
  • input_trajectory_name: Name of the input trajectory file (input_trajectory_name.csv).
  • input_depth_dir_postfix: Suffix of the name of the folder that contains the depth maps.
  • result_rendering_trajectory_name: Camera trajectory used for visualization.
  • texture_merge_take_max, texture_merge_topK, texture_merge_topKoutlier_colorthreshold: Additional parameters for texture mapping.
  • texture_map_max_sizeK: Texture map size parameter. If you run out of memory, reduce this value.
  • result_rendering_fov: Rendering field of view.
  • result_rendering_h: Rendering window height.
  • result_rendering_w: Rendering window width.
  • cube_side: Size of the cubemap used for texture mapping.
  • result_rendering_initial_rotation: Initial camera rotation matrix for rendering.
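As a reading aid for the truncation parameters, here is a minimal sketch of a depth-adaptive truncated SDF. It assumes the linear truncation band truncation * depth + truncation_offset suggested by the slope/offset naming; how truncation_change and weight_trunc_std modify it is not reproduced here:

import numpy as np

def truncated_sdf(signed_dist, depth, truncation, truncation_offset):
    # Clamp the signed distance to a truncation band that grows
    # linearly with the observed depth (slope 'em', offset 'en').
    tau = truncation * depth + truncation_offset
    return np.clip(signed_dist / tau, -1.0, 1.0)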

License

Hyeonjoong Jang, Andreas Meuleman, Dahyun Kang, Donggun Kim, and Min H. Kim have developed this software and related documentation (the "Software"); confidential use in source form of the Software, without modification, is permitted provided that the following conditions are met:

Neither the name of the copyright holder nor the names of any contributors may be used to endorse or promote products derived from the Software without specific prior written permission.

The use of the software is for Non-Commercial Purposes only. As used in this Agreement, “Non-Commercial Purpose” means for the purpose of education or research in a non-commercial organization only. “Non-Commercial Purpose” excludes, without limitation, any use of the Software for, as part of, or in any way in connection with a product (including software) or service which is sold, offered for sale, licensed, leased, published, loaned or rented. If you require a license for a use excluded by this agreement, please email [email protected].

Warranty: KAIST-VCLAB MAKES NO REPRESENTATIONS OR WARRANTIES ABOUT THE SUITABILITY OF THE SOFTWARE, EITHER EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE IMPLIED WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE, OR NON-INFRINGEMENT. KAIST-VCLAB SHALL NOT BE LIABLE FOR ANY DAMAGES SUFFERED BY LICENSEE AS A RESULT OF USING, MODIFYING OR DISTRIBUTING THIS SOFTWARE OR ITS DERIVATIVES.

Please refer to the license file for more details.

Acknowledgement

This project uses the following open-source libraries and repositories; please consider citing them if you use the related functionality: RAFT (https://github.com/princeton-vl/RAFT) and dmc (https://github.com/planaria/dmc).

BibTeX

@Article{jang2022egocentric,
  author  = {Hyeonjoong Jang and Andreas Meuleman and Dahyun Kang and Donggun Kim and Christian Richardt and Min H. Kim},
  title   = {Egocentric Scene Reconstruction from an Omnidirectional Video},
  journal = {ACM Transactions on Graphics (Proc. SIGGRAPH 2022)},
  month   = {August},
  year    = {2022},
  volume  = {XX},
  number  = {X},
  doi     = {XXX},
  url     = {XXX}
}
