  • Stars: 202
  • Rank 189,624 (Top 4 %)
  • Language: Python
  • Created over 4 years ago
  • Updated over 2 years ago

Repository Details

Adding positional encoding to the input preserves sharp edges in the image

nerf2D

nerf2D is a 2D toy illustration of Neural Radiance Fields (NeRF). It shows how adding the gamma encoding (also referred to as positional encoding; Eq. 4 in the NeRF paper) improves results significantly.

The task is to reconstruct an image (pixel colour values) from its 2D coordinates. The dataset consists of tuples ((x, y), (r, g, b)), where (x, y) is the input and (r, g, b) is the output. We train a 2-layer MLP with ReLU activations to map (x, y) to (r, g, b). The input is normalised to the range [-1, 1] (as also mentioned in the paper), and the output is in the range [-1, 1] as well. The purpose of this 2D illustration is to show that lifting the input (x, y) to a higher-dimensional space via the gamma encoding makes it easier for the network to learn. Training with raw (x, y) results in blurry reconstructions, while adding the gamma encoding improves the results dramatically, i.e. the network is able to preserve the sharp edges in the image.
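As a rough illustration of the dataset described above, the sketch below builds the ((x, y), (r, g, b)) tuples from a tiny in-memory image using only the standard library. The helper names `normalise` and `build_dataset`, the nested-list image layout, and the choice to map pixel centres linearly onto [-1, 1] are assumptions made for this example, not taken from the repository.

```python
def normalise(index, size):
    """Map an integer pixel index in [0, size - 1] to a float in [-1, 1]."""
    if size == 1:
        return 0.0  # a single pixel sits at the centre of the range
    return 2.0 * index / (size - 1) - 1.0


def build_dataset(image):
    """Convert image[row][col] = (r, g, b), with channels in 0..255,
    into a list of ((x, y), (r, g, b)) tuples with all values in [-1, 1]."""
    height, width = len(image), len(image[0])
    samples = []
    for row in range(height):
        for col in range(width):
            xy = (normalise(col, width), normalise(row, height))
            rgb = tuple(2.0 * channel / 255.0 - 1.0 for channel in image[row][col])
            samples.append((xy, rgb))
    return samples


# A hypothetical 2x2 image: black, red on the top row; white, blue below.
tiny_image = [
    [(0, 0, 0), (255, 0, 0)],
    [(255, 255, 255), (0, 0, 255)],
]
dataset = build_dataset(tiny_image)
```

Each entry of `dataset` is one training pair; a real image would simply produce one pair per pixel.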

[Equation: the gamma (positional) encoding, Eq. 4 in the NeRF paper]
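For reference, the gamma encoding of Eq. 4 in the NeRF paper can be written as below. It is applied separately to each normalised input coordinate p (here, x and y), and L is the number of frequency bands. The notation follows the paper and may differ cosmetically from the equation image in the original README.

```latex
\gamma(p) = \left( \sin(2^{0}\pi p),\ \cos(2^{0}\pi p),\ \ldots,\ \sin(2^{L-1}\pi p),\ \cos(2^{L-1}\pi p) \right)
```

Each coordinate therefore expands into 2L features, so a 2D input (x, y) becomes a vector of 4L values.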

The sin plots for various values of L are:

[Figure: Sin-Plots]

The corresponding cos plots are:

[Figure: Cos-Plots]

Below, we show results with and without positional encoding. We use 2-layer MLPs, each layer with 128 features and ReLU activations. The left image is the dataset image, the middle is the reconstruction using positional encoding, and the right is the reconstruction from just raw (x, y). The flickering in the images is due to renormalisation of (r, g, b) from [-1, 1] to [0, 255] at every epoch. Note that the network that takes raw (x, y) as input recovers hardly any high-frequency detail.
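The text does not say exactly how the renormalisation from [-1, 1] to [0, 255] is performed. The sketch below assumes a fixed linear map with clipping; the function name `to_pixel` is hypothetical, and the repository may instead rescale each frame differently.

```python
def to_pixel(value):
    """Map a network output in [-1, 1] to an integer pixel value in [0, 255].

    Values outside [-1, 1] are clipped first (an assumption of this sketch).
    """
    clipped = max(-1.0, min(1.0, value))
    return int(round((clipped + 1.0) * 0.5 * 255.0))


def to_pixels(rgb):
    """Apply to_pixel to every channel of an (r, g, b) prediction."""
    return tuple(to_pixel(channel) for channel in rgb)


# A hypothetical network output with one channel slightly out of range.
pixel = to_pixels((-1.0, 1.0, 1.3))
```

In this sketch an out-of-range prediction such as 1.3 saturates at 255 rather than wrapping around.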

In the positional encoding we use L=10 in most cases, but for reconstructions with higher-frequency content this number could be increased. The best value varies from image to image, so L should be treated as a hyper-parameter. This positional encoding bears a strong resemblance to the well-known Random Fourier Features from the paper by Rahimi & Recht. In the particular positional encoding used in the NeRF work, which we implemented, the features are computed at different scales and with a phase shift of pi/2. In our experiments, we found both the scale and the phase shift to be very important.
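A minimal sketch of this encoding in plain Python is given below. It follows Eq. 4 of the NeRF paper, with the frequency doubling at every band (the scale) and a cosine paired with each sine (the pi/2 phase shift). The function names `gamma_encode` and `encode_xy`, the parameter name `num_bands` (standing in for L), and the interleaved sin/cos ordering are assumptions of this example and may not match the repository code.

```python
import math


def gamma_encode(p, num_bands=10):
    """Encode a scalar p in [-1, 1] as 2 * num_bands sin/cos features."""
    features = []
    for band in range(num_bands):
        angle = (2.0 ** band) * math.pi * p  # scale: frequency doubles per band
        features.append(math.sin(angle))
        features.append(math.cos(angle))     # cos is sin shifted in phase by pi/2
    return features


def encode_xy(x, y, num_bands=10):
    """Encode both coordinates and concatenate them into one input vector."""
    return gamma_encode(x, num_bands) + gamma_encode(y, num_bands)


# With L = 10, a 2D coordinate becomes a 40-dimensional network input.
encoded = encode_xy(0.5, -0.25)
```

Raising `num_bands` adds higher-frequency features at the cost of a wider input layer, which is why L behaves like a hyper-parameter.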

The repo also has code for experiments with sawtooth and RBF features with scale and phase shift.

Glasses Image

Image Credits: http://hof.povray.org/glasses.html

[Figure: Glasses]

Cool Cows Image

Image Credits: http://hof.povray.org/vaches.html

[Figure: Cool Cows]

House Image

Image Credits: http://hof.povray.org/dhouse39.html

[Figure: House]

Training with more layers

We also trained an MLP with 8 layers, each with 128 features, ReLU and BatchNorm.

[Figure: More-Layers]

Requirements

tensorflow 2.0
opencv-python
python 3.6

Contact

Ankur Handa (handa (dot) ankur (at) gmail (dot) com)
