  • Language
    Python
  • License
    MIT License
  • Created almost 7 years ago
  • Updated almost 3 years ago

Repository Details

Demo code for the paper "Learning SO(3) Equivariant Representations with Spherical CNNs"

Learning SO(3) Equivariant Representations with Spherical CNNs

Abstract

We address the problem of 3D rotation equivariance in convolutional neural networks. 3D rotations have been a challenging nuisance in 3D classification tasks, requiring higher capacity and extended data augmentation to tackle them. We model 3D data with multi-valued spherical functions and we propose a novel spherical convolutional network that implements exact convolutions on the sphere by realizing them in the spherical harmonic domain. The resulting filters have local symmetry and are localized by enforcing smooth spectra. We apply a novel pooling on the spectral domain, and our operations are independent of the underlying spherical resolution throughout the network. We show that networks with much lower capacity and without requiring data augmentation can exhibit performance comparable to the state of the art in standard retrieval and classification benchmarks.
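
To make "convolutions in the spherical harmonic domain" concrete: under the standard spherical convolution theorem for a zonal (axially symmetric) filter h, each coefficient of the output is the input coefficient of degree l and order m times the filter's order-0 coefficient of degree l, scaled by 2*pi*sqrt(4*pi/(2l+1)). The sketch below illustrates only that pointwise product. It is not the repository's TensorFlow implementation; the function name and the dictionary layout keyed by (l, m) are assumptions made for illustration.

```python
import math


def spectral_sphere_conv(f_hat, h_hat_zonal):
    """Convolve on the sphere by multiplying spherical harmonic coefficients.

    f_hat: dict mapping (l, m) -> complex coefficient of the input function
           (hypothetical layout chosen for this sketch).
    h_hat_zonal: sequence where h_hat_zonal[l] is the order-0 coefficient
                 of a zonal filter at degree l.
    Returns a dict with the same keys as f_hat.
    """
    out = {}
    for (ell, m), coeff in f_hat.items():
        # Degree-dependent scale factor from the convolution theorem.
        scale = 2.0 * math.pi * math.sqrt(4.0 * math.pi / (2 * ell + 1))
        out[(ell, m)] = scale * coeff * h_hat_zonal[ell]
    return out


# Toy input with bandwidth 2 (degrees l = 0 and l = 1).
f_hat = {(0, 0): 1 + 0j, (1, -1): 2j, (1, 0): 1 + 0j, (1, 1): -2j}
h_hat = [1.0, 0.5]
g_hat = spectral_sphere_conv(f_hat, h_hat)
```

Because the product is taken per coefficient, the cost depends on the bandwidth kept in the spectrum rather than on the sampling grid, which is consistent with the abstract's remark about independence from spherical resolution.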

Demo

This repository contains a demo, where we train and test the model on the SO(3)-rotated ModelNet40 dataset.

Check the requirements in requirements.txt. The codebase has been tested on TensorFlow 1.6, but that dependency is commented out in requirements.txt to silence GitHub’s security warnings, so TensorFlow has to be installed separately.

Download the dataset (7.0 GB) and save the archive to /tmp, where the commands below expect it.

The following commands should:

  • uncompress the downloaded dataset from /tmp to ~/data/m40-so3-64,
  • clone this repo,
  • create a virtualenv,
  • install the requirements,
  • train and test the model.

mkdir -p ~/data/m40-so3-64 && tar -xvf /tmp/m40_tf_csph_3d_48aug_64.tar.gz -C ~/data/m40-so3-64

git clone https://github.com/daniilidis-group/spherical-cnn.git
cd spherical-cnn

virtualenv -p python3 env
source env/bin/activate
pip install -r requirements.txt

TF_CPP_MIN_LOG_LEVEL=1 python3 scripts/train.py \
                               @params/model-64.txt \
                               @params/m40-64.txt \
                               @params/training.txt \
                               --dset_dir ~/data/m40-so3-64 \
                               --logdir /tmp/m40-so3 \
                               --run_id m40-so3
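
The @params/... arguments in the command above name files that hold further command-line arguments. Python's argparse supports this pattern through fromfile_prefix_chars. The sketch below shows the mechanism in isolation; whether scripts/train.py configures its parser exactly this way is an assumption, and the file contents written here are made up for the demonstration.

```python
import argparse
import os
import tempfile


def build_parser():
    # Arguments starting with "@" are replaced by the contents of that file,
    # read as one argument per line (argparse's default behaviour).
    parser = argparse.ArgumentParser(fromfile_prefix_chars="@")
    parser.add_argument("--dset_dir", default=None)
    parser.add_argument("--logdir", default=None)
    parser.add_argument("--run_id", default="default")
    return parser


def demo():
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "training.txt")
        with open(path, "w") as fh:
            # Hypothetical parameter file for this sketch.
            fh.write("--logdir\n/tmp/m40-so3\n--run_id\nfrom-file\n")
        # Flags given after the @file override values read from the file.
        return build_parser().parse_args(["@" + path, "--run_id", "m40-so3"])


args = demo()
print(args.logdir, args.run_id)
```

With a parser set up like this, explicit flags such as --dset_dir can be mixed freely with parameter files, and later occurrences of a flag take precedence over earlier ones.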

Sample output:

Running on m40-so3
Namespace(...)
Loading dataset from from_cached_tfrecords...
Loading model two_branch. Logdir /tmp/m40-so3
Start training...
epoch=1; lr=0.0010 train: 0.2286, valid: 0.2286
epoch=2; lr=0.0010 train: 0.4335, valid: 0.4335
epoch=3; lr=0.0010 train: 0.5975, valid: 0.5975
(...)
epoch=46; lr=0.0000 train: 0.9630, valid: 0.9630
epoch=47; lr=0.0000 train: 0.9640, valid: 0.9640
epoch=48; lr=0.0000 train: 0.9724, valid: 0.9724
Start testing...
| model   |  train |    val |   test | train time |
| m40-so3 | 0.9724 | 0.9724 | 0.8679 |      88.31 |
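
To plot or compare runs, the per-epoch lines of the sample output above can be parsed into records. This is a minimal sketch that assumes the log lines match the sample format exactly; the regular expression and the function name parse_epochs are illustrative and not part of the repository.

```python
import re

# Matches lines such as "epoch=1; lr=0.0010 train: 0.2286, valid: 0.2286".
EPOCH_RE = re.compile(
    r"epoch=(?P<epoch>\d+); lr=(?P<lr>[\d.eE+-]+) "
    r"train: (?P<train>[\d.]+), valid: (?P<valid>[\d.]+)"
)


def parse_epochs(log_text):
    """Return one dict per epoch line; other lines are ignored."""
    rows = []
    for line in log_text.splitlines():
        match = EPOCH_RE.search(line)
        if match:
            rows.append({
                "epoch": int(match["epoch"]),
                "lr": float(match["lr"]),
                "train": float(match["train"]),
                "valid": float(match["valid"]),
            })
    return rows


sample = """Start training...
epoch=1; lr=0.0010 train: 0.2286, valid: 0.2286
epoch=2; lr=0.0010 train: 0.4335, valid: 0.4335
(...)
epoch=48; lr=0.0000 train: 0.9724, valid: 0.9724
Start testing..."""
rows = parse_epochs(sample)
print(rows[-1])
```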

This reproduces the SO(3)/SO(3) result in Table 1 of the paper (86.9%). It runs in about 75 min on an NVIDIA GeForce GTX 1080 Ti and requires ~2.2 GB of GPU memory.

Change --dset_dir if you saved the dataset elsewhere.

Call pytest to run the unit tests.

References

Esteves, C., Allen-Blanchette, C., Makadia, A., & Daniilidis, K. Learning SO(3) Equivariant Representations with Spherical CNNs. European Conference on Computer Vision, ECCV 2018 (oral). http://arxiv.org/abs/1711.06721

@article{esteves17_learn_so_equiv_repres_with_spher_cnns,
  author = {Esteves, Carlos and Allen-Blanchette, Christine and Makadia, Ameesh and Daniilidis, Kostas},
  title = {Learning {SO(3)} Equivariant Representations with Spherical {CNNs}},
  journal = {CoRR},
  year = {2017},
  url = {http://arxiv.org/abs/1711.06721},
  archivePrefix = {arXiv},
  eprint = {1711.06721},
  primaryClass = {cs.CV},
}

Authors

Carlos Esteves [1], Christine Allen-Blanchette [1], Ameesh Makadia [2], Kostas Daniilidis [1]

[1] GRASP Laboratory, University of Pennsylvania

[2] Google
