• Stars
    star
    1,227
  • Rank 37,157 (Top 0.8 %)
  • Language
    Python
  • License
    MIT License
  • Created about 3 years ago
  • Updated over 2 years ago

Reviews

There are no reviews yet. Be the first to send feedback to the community and the maintainers!

Repository Details

This repository contains the code for the CVPR 2021 paper "GIRAFFE: Representing Scenes as Compositional Generative Neural Feature Fields"

GIRAFFE: Representing Scenes as Compositional Generative Neural Feature Fields

Project Page | Paper | Supplementary | Video | Slides | Blog | Talk

Add Clevr Tranlation Horizontal Cars Interpolate Shape Faces

If you find our code or paper useful, please cite as

@inproceedings{GIRAFFE,
    title = {GIRAFFE: Representing Scenes as Compositional Generative Neural Feature Fields},
    author = {Niemeyer, Michael and Geiger, Andreas},
    booktitle = {Proc. IEEE Conf. on Computer Vision and Pattern Recognition (CVPR)},
    year = {2021}
}

TL; DR - Quick Start

Rotating Cars Tranlation Horizontal Cars Tranlation Horizontal Cars

First you have to make sure that you have all dependencies in place. The simplest way to do so, is to use anaconda.

You can create an anaconda environment called giraffe using

conda env create -f environment.yml
conda activate giraffe

You can now test our code on the provided pre-trained models. For example, simply run

python render.py configs/256res/cars_256_pretrained.yaml

This script should create a model output folder out/cars256_pretrained. The animations are then saved to the respective subfolders in out/cars256_pretrained/rendering.

Usage

Datasets

To train a model from scratch or to use our ground truth activations for evaluation, you have to download the respective dataset.

For this, please run

bash scripts/download_dataset.sh

and following the instructions. This script should download and unpack the data automatically into the data/ folder.

Controllable Image Synthesis

To render images of a trained model, run

python render.py CONFIG.yaml

where you replace CONFIG.yaml with the correct config file. The easiest way is to use a pre-trained model. You can do this by using one of the config files which are indicated with *_pretrained.yaml.

For example, for our model trained on Cars at 256x256 pixels, run

python render.py configs/256res/cars_256_pretrained.yaml

or for celebA-HQ at 256x256 pixels, run

python render.py configs/256res/celebahq_256_pretrained.yaml

Our script will automatically download the model checkpoints and render images. You can find the outputs in the out/*_pretrained folders.

Please note that the config files *_pretrained.yaml are only for evaluation or rendering, not for training new models: when these configs are used for training, the model will be trained from scratch, but during inference our code will still use the pre-trained model.

FID Evaluation

For evaluation of the models, we provide the script eval.py. You can run it using

python eval.py CONFIG.yaml

The script generates 20000 images and calculates the FID score.

Note: For some experiments, the numbers in the paper might slightly differ because we used the evaluation protocol from GRAF to fairly compare against the methods reported in GRAF.

Training

Finally, to train a new network from scratch, run

python train.py CONFIG.yaml

where you replace CONFIG.yaml with the name of the configuration file you want to use.

You can monitor on http://localhost:6006 the training process using tensorboard:

cd OUTPUT_DIR
tensorboard --logdir ./logs

where you replace OUTPUT_DIR with the respective output directory. For available training options, please take a look at configs/default.yaml.

2D-GAN Baseline

For convinience, we have implemented a 2D-GAN baseline which closely follows this GAN_stability repo. For example, you can train a 2D-GAN on CompCars at 64x64 pixels similar to our GIRAFFE method by running

python train.py configs/64res/cars_64_2dgan.yaml

Using Your Own Dataset

If you want to train a model on a new dataset, you first need to generate ground truth activations for the intermediate or final FID calculations. For this, you can use the script in scripts/calc_fid/precalc_fid.py. For example, if you want to generate an FID file for the comprehensive cars dataset at 64x64 pixels, you need to run

python scripts/precalc_fid.py  "data/comprehensive_cars/images/*.jpg" --regex True --gpu 0 --out-file "data/comprehensive_cars/fid_files/comprehensiveCars_64.npz" --img-size 64

or for LSUN churches, you need to run

python scripts/precalc_fid.py path/to/LSUN --class-name scene_categories/church_outdoor_train_lmdb --lsun True --gpu 0 --out-file data/church/fid_files/church_64.npz --img-size 64

Note: We apply the same transformations to the ground truth images for this FID calculation as we do during training. If you want to use your own dataset, you need to adjust the image transformations in the script accordingly. Further, you might need to adjust the object-level and camera transformations to your dataset.

Evaluating Generated Images

We provide the script eval_files.py for evaluating the FID score of your own generated images. For example, if you would like to evaluate your images on CompCars at 64x64 pixels, save them to an npy file and run

python eval_files.py --input-file "path/to/your/images.npy" --gt-file "data/comprehensive_cars/fid_files/comprehensiveCars_64.npz"

Futher Information

More Work on Implicit Representations

If you like the GIRAFFE project, please check out related works on neural representions from our group:

More Repositories

1

sdfstudio

A Unified Framework for Surface Reconstruction
Python
1,861
star
2

occupancy_networks

This repository contains the code for the paper "Occupancy Networks - Learning 3D Reconstruction in Function Space"
Python
1,454
star
3

stylegan-t

[ICML'23] StyleGAN-T: Unlocking the Power of GANs for Fast Large-Scale Text-to-Image Synthesis
Python
1,122
star
4

transfuser

[PAMI'23] TransFuser: Imitation with Transformer-Based Sensor Fusion for Autonomous Driving; [CVPR'21] Multi-Modal Fusion Transformer for End-to-End Autonomous Driving
Python
957
star
5

stylegan-xl

[SIGGRAPH'22] StyleGAN-XL: Scaling StyleGAN to Large Diverse Datasets
Python
939
star
6

projected-gan

[NeurIPS'21] Projected GANs Converge Faster
Python
876
star
7

unimatch

[TPAMI'23] Unifying Flow, Stereo and Depth Estimation
Python
855
star
8

convolutional_occupancy_networks

[ECCV'20] Convolutional Occupancy Networks
Python
792
star
9

differentiable_volumetric_rendering

This repository contains the code for the CVPR 2020 paper "Differentiable Volumetric Rendering: Learning Implicit 3D Representations without 3D Supervision"
Python
782
star
10

mip-splatting

[CVPR'24 Oral] Mip-Splatting: Alias-free 3D Gaussian Splatting
Python
700
star
11

monosdf

[NeurIPS'22] MonoSDF: Exploring Monocular Geometric Cues for Neural Implicit Surface Reconstruction
Python
535
star
12

shape_as_points

[NeurIPS'21] Shape As Points: A Differentiable Poisson Solver
Python
518
star
13

unisurf

[ICCV'21] UNISURF: Unifying Neural Implicit Surfaces and Radiance Fields for Multi-View Reconstruction
Python
410
star
14

graf

Official code release for "GRAF: Generative Radiance Fields for 3D-Aware Image Synthesis"
Jupyter Notebook
393
star
15

tuplan_garage

[CoRL'23] Parting with Misconceptions about Learning-based Vehicle Motion Planning
Python
370
star
16

kitti360Scripts

This repository contains utility scripts for the KITTI-360 dataset.
Python
353
star
17

neat

[ICCV'21] NEAT: Neural Attention Fields for End-to-End Autonomous Driving
Python
301
star
18

gaussian-opacity-fields

Gaussian Opacity Fields for Efficient and Compact Surface Reconstruction in Unbounded Scenes
Python
285
star
19

occupancy_flow

This repository contains the code for the ICCV 2019 paper "Occupancy Flow - 4D Reconstruction by Learning Particle Dynamics"
Python
207
star
20

plant

[CoRL'22] PlanT: Explainable Planning Transformers via Object-Level Representations
Python
192
star
21

factor-fields

[SIGGRAPH 2023] We provide a unified formula for neural fields (Factor Fields) and a novel dictionary factorization (Dictionary Fields)
Jupyter Notebook
183
star
22

voxgraf

Official code release for VoxGRAF: Fast 3D-Aware Image Synthesis with Sparse Voxel Grids
Python
123
star
23

carla_garage

[ICCV'23] Hidden Biases of End-to-End Driving Models
Python
121
star
24

texture_fields

This repository contains code for the paper 'Texture Fields: Learning Texture Representations in Function Space'.
Python
113
star
25

sledge

SLEDGE: Synthesizing Simulation Environments for Driving Agents with Generative Models
105
star
26

kitti360LabelTool

JavaScript
103
star
27

counterfactual_generative_networks

[ICLR'21] Counterfactual Generative Networks
Python
102
star
28

gta

[ICLR'24] GTA: A Geometry-Aware Attention Mechanism for Multi-view Transformers
Python
95
star
29

murf

[CVPR'24] MuRF: Multi-Baseline Radiance Fields
Python
84
star
30

controllable_image_synthesis

Towards Unsupervised Learning of Generative Models for 3D Controllable Image Synthesis, CVPR 2020
Python
69
star
31

king

[ECCV'22] KING: Generating Safety-Critical Driving Scenarios for Robust Imitation via Kinematics Gradients
Python
61
star
32

handheld_svbrdf_geometry

On Joint Estimation of Pose, Geometry and svBRDF from a Handheld Scanner, CVPR2020
Python
57
star
33

navsim

NAVSIM: Data-Driven Non-Reactive Autonomous Vehicle Simulation
Python
52
star
34

connecting_the_dots

This repository contains the code for the paper "Connecting the Dots: Learning Representations for Active Monocular Depth Estimation" https://avg.is.tuebingen.mpg.de/publications/riegler2019cvpr
Python
51
star
35

frequency_bias

Official code for "On the Frequency Bias of Generative Models", NeurIPS 2021
Python
39
star
36

data_aggregation

This repository contains the code for the CVPR 2020 paper "Exploring Data Aggregation in Policy Learning for Vision-based Urban Autonomous Driving"
Python
38
star
37

good

[ICLR'23] GOOD: Exploring Geometric Cues for Detecting Objects in an Open World
Python
36
star
38

campari

[3DV'21] CAMPARI: Camera-Aware Decomposed Generative Neural Radiance Fields
Python
29
star
39

autonomousvision.github.io

Blog of the Autonomous Vision Group at MPI-IS Tübingen and University of Tübingen.
HTML
19
star
40

visual_abstractions

6
star
41

slides

Slide repository of the Autonomous Vision Group at MPI-IS Tübingen and University of Tübingen.
CSS
2
star
42

similarity_reconstruction

This code is based on the paper Exploiting Object Similarity in 3D Reconstruction.
C++
1
star
43

slow_flow

This code is based on the paper Slow Flow: Exploiting High-Speed Cameras for Accurate and Diverse Optical Flow Reference Data.
C++
1
star