Deep Convolutional Inverse Graphics Network

This repository contains the code for the network described in http://arxiv.org/abs/1503.03167.

Use Cases:

  • Unsupervised Feature Learning
  • Neural 3D graphics engine: Given a static face image, our model can re-render (hallucinate) the face with arbitrary light and viewpoint transformations. Below is a sample movie generated by our model from a single face photograph. It is produced by varying the light neuron and obtaining the image frame prediction at each time step. The same can be done for pose variations (see the paper or the project website).

[Animation: a DC-IGN lighting demo]

Project Website: http://willwhitney.github.io/dc-ign/www/

Citation

@article{kulkarni2015deep,
  title={Deep Convolutional Inverse Graphics Network},
  author={Kulkarni, Tejas D and Whitney, Will and Kohli, Pushmeet and Tenenbaum, Joshua B},
  journal={arXiv preprint arXiv:1503.03167},
  year={2015}
}

Running

Requirements

Facebook has some great instructions for installing these over at https://github.com/facebook/fbcunn/blob/master/INSTALL.md

Instructions

Dataset and pre-trained network: The train/test dataset can be downloaded from Dropbox or Amazon S3.

A pretrained network is also available if you just want to see the results: Dropbox, Amazon S3

Update 06/23/16: We've been getting a bunch of traffic due to the (highly recommended!) InfoGAN paper, so I've mirrored the files on S3. If neither Dropbox nor S3 works, please email me and I'll get it to you another way.

Training a network with separated pose/light/shape etc (disentangled representations)

  1. git clone this repo
  2. Download the dataset and unzip it.
  3. Grab a coffee while you wait for that to happen. It's pretty big.
  4. Run th monovariant_main.lua --help to see the available options.
  5. To train from scratch:
    1. Run something like th monovariant_main.lua --no_load --name my_first_dcign --datasetdir <path_to_dataset>
    2. The network saves itself to networks/<name> after each epoch.
    3. After a couple of epochs, open up visualize_networks.lua and set network_search_str to your network's name. Then you can run th visualize_networks.lua and it will create a folder called renderings with some visualizations of the kinds of faces your network generates.
  6. To use a pretrained network:
    1. Download the pretrained network and unzip it.
    2. More coffee while you wait.
    3. Run a command like th monovariant_main.lua --import <path/to/unzipped/network/dir> --name my_first_dcign --datasetdir <path_to_dataset> that imports the directory of that pretrained net.
    4. Or, just do the visualize_networks thing from above with the pretrained network to see what it makes.
  7. By default the code runs on the CPU. To enable CUDA, add the following flags:
    1. --useCuda --deviceId deviceToUse (the default deviceId is 1).
    2. For cuDNN, use --useCuda --useCudnn --deviceId deviceToUse.
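
The commands from the steps above can be collected into a small dry-run script. This is only a sketch: it prints the th command lines rather than running them, and the variable names (DATASET_DIR, NET_NAME, PRETRAINED_DIR), their placeholder values, and the helper functions are assumptions made for illustration, not part of the repository. Only the flags themselves come from the instructions above.

```shell
#!/usr/bin/env bash
# Dry-run sketch: prints the th command lines from the steps above.
# All variable names, default values and helper functions are hypothetical.
set -eu

DATASET_DIR="${DATASET_DIR:-/path/to/dataset}"
NET_NAME="${NET_NAME:-my_first_dcign}"
PRETRAINED_DIR="${PRETRAINED_DIR:-/path/to/unzipped/network/dir}"

# Step 5: train from scratch.
train_cmd() {
  echo "th monovariant_main.lua --no_load --name $NET_NAME --datasetdir $DATASET_DIR"
}

# Step 6: start from the pretrained network directory.
import_cmd() {
  echo "th monovariant_main.lua --import $PRETRAINED_DIR --name $NET_NAME --datasetdir $DATASET_DIR"
}

# Step 7: GPU flags. $1 is the device id (defaults to 1);
# pass "cudnn" as $2 to add --useCudnn as well.
gpu_flags() {
  device="${1:-1}"
  if [ "${2:-}" = "cudnn" ]; then
    echo "--useCuda --useCudnn --deviceId $device"
  else
    echo "--useCuda --deviceId $device"
  fi
}

# Print three variants: CPU training, CUDA training, cuDNN with a pretrained net.
echo "$(train_cmd)"
echo "$(train_cmd) $(gpu_flags 1)"
echo "$(import_cmd) $(gpu_flags 1 cudnn)"
```

To use it, set DATASET_DIR (and PRETRAINED_DIR if needed) to real locations, check the printed lines, and then paste the one you want into a terminal from the repository root.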

Training a network with undifferentiated latents

Instructions coming soon, but if you're not afraid of code that hasn't been cleaned up yet, check out main.lua.

Paper abstract

This paper presents the Deep Convolution Inverse Graphics Network (DC-IGN) that aims to learn an interpretable representation of images that is disentangled with respect to various transformations such as object out-of-plane rotations, lighting variations, and texture. The DC-IGN model is composed of multiple layers of convolution and de-convolution operators and is trained using the Stochastic Gradient Variational Bayes (SGVB) algorithm. We propose training procedures to encourage neurons in the graphics code layer to have semantic meaning and force each group to distinctly represent a specific transformation (pose, light, texture, shape etc.). Given a static face image, our model can re-generate the input image with different pose, lighting or even texture and shape variations from the base face. We present qualitative and quantitative results of the model's efficacy to learn a 3D rendering engine. Moreover, we also utilize the learnt representation for two important visual recognition tasks: (1) an invariant face recognition task and (2) using the representation as a summary statistic for generative modeling.

Acknowledgements

A big shout-out to all the Torch developers. Torch is simply awesome. We thank Thomas Vetter for giving us access to the Basel face model. T. Kulkarni was graciously supported by the Leventhal Fellowship. This research was supported by ONR award N000141310333, ARO MURI W911NF-13-1-2012 and CBMM. We would also like to thank y0ast (https://github.com/y0ast) for making the variational autoencoder code available online.
