

Repository Details

Code for paper "Orthogonal Convolutional Neural Networks".

Orthogonal-Convolutional-Neural-Networks

[Project] [Paper]

Overview

This is the authors' re-implementation of the orthogonal convolutional neural networks and regularizers described in:
"Orthogonal Convolutional Neural Networks"
Jiayun Wang, Yubei Chen, Rudrasis Chakraborty, Stella X. Yu (UC Berkeley/ICSI), IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2020.

To quickly add the orthogonal loss to your own network, refer to the functions orth_dist and deconv_orth_dist.
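As a rough illustration of what such regularizers can look like, here is a minimal PyTorch sketch. It is not the repository's code: the function names carry a `_sketch` suffix to mark them as hypothetical, square kernels are assumed, and the padding rule `((k - 1) // stride) * stride` is an assumption chosen so that the zero-shift term lands at the center of the self-convolution output. `reg_weight` and `task_loss` are placeholders.

```python
import torch
import torch.nn.functional as F


def orth_dist_sketch(weight: torch.Tensor) -> torch.Tensor:
    """Frobenius distance between the Gram matrix of flattened filters and I."""
    mat = weight.reshape(weight.shape[0], -1)
    if mat.shape[0] > mat.shape[1]:
        mat = mat.t()  # always work with the smaller Gram matrix
    gram = mat @ mat.t()
    eye = torch.eye(gram.shape[0], dtype=mat.dtype, device=mat.device)
    return torch.linalg.norm(gram - eye)


def conv_orth_dist_sketch(kernel: torch.Tensor, stride: int = 1) -> torch.Tensor:
    """Distance of the strided self-convolution of `kernel` from an identity map.

    `kernel` has shape (out_channels, in_channels, k, k); square kernels assumed.
    """
    out_ch, _, k, _ = kernel.shape
    padding = ((k - 1) // stride) * stride  # assumption: keeps zero shift centered
    out = F.conv2d(kernel, kernel, stride=stride, padding=padding)
    target = torch.zeros_like(out)
    center = out.shape[-1] // 2
    target[:, :, center, center] = torch.eye(out_ch, dtype=out.dtype, device=out.device)
    return torch.linalg.norm(out - target)


# Usage: add the penalty to a task loss (both values below are placeholders).
conv = torch.nn.Conv2d(8, 16, kernel_size=3, padding=1)
task_loss = torch.tensor(1.0)
reg_weight = 0.1  # hypothetical regularization weight
total_loss = task_loss + reg_weight * conv_orth_dist_sketch(conv.weight, stride=conv.stride[0])
```

Calling `total_loss.backward()` then propagates the penalty's gradient into `conv.weight` alongside the task gradient.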

Updates

As a stand-alone feature, we provide code to calculate filter similarities for CNNs (as described in Fig. 1a and 1c of the paper).
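One plausible way to compute such similarities is sketched below with NumPy. It assumes cosine similarity between flattened filters of a single layer; the measure used in the paper and in the repository's script may differ, and `filter_similarities` is a hypothetical name. A PyTorch weight can be passed in via `weight.detach().cpu().numpy()`.

```python
import numpy as np


def filter_similarities(weight: np.ndarray) -> np.ndarray:
    """Pairwise cosine similarities between the filters of one conv layer.

    `weight` has shape (out_channels, in_channels, k, k). Each unordered pair
    of distinct filters contributes one value.
    """
    flat = weight.reshape(weight.shape[0], -1)
    norms = np.linalg.norm(flat, axis=1, keepdims=True)
    unit = flat / np.maximum(norms, 1e-12)  # guard against all-zero filters
    sim = unit @ unit.T
    rows, cols = np.triu_indices(sim.shape[0], k=1)  # strictly upper triangle
    return sim[rows, cols]


# Example on random weights standing in for a trained layer.
rng = np.random.default_rng(0)
weight = rng.standard_normal((16, 3, 3, 3))
sims = filter_similarities(weight)
counts, edges = np.histogram(sims, bins=10, range=(-1.0, 1.0))
```

A histogram concentrated near zero indicates filters that are close to mutually orthogonal; mass near ±1 indicates redundant filters.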

Requirements

Overall architecture

This repo will contain the source code for the experiments in the paper. So far, we have released the code for image classification. To run classification on your own dataset, change the dataset folder path and the number of classes.

Image classification

We use ImageNet classification as an example. Users can also switch the data to CIFAR or another image classification dataset of interest. The code is heavily based on the PyTorch examples.

  • Navigate to the "imagenet" folder.

  • We now support orthogonal convolutions for resnet34 and resnet50. You can run resnet50 on ImageNet using the following command:

    python main_orth50.py --dist-url 'tcp://127.0.0.1:1321' --dist-backend 'nccl' --multiprocessing-distributed --world-size 1 --rank 0 -a resnet50 -j 4 -r 0.5 -b 220 /data/ILSVRC2012 --print-freq 200

  • For more details, including multi-GPU settings, please refer to here.

Note

The current code supports multi-GPU settings.

Q & A

Q: What is the difference between "orth_dist" and "deconv_orth_dist"?

A: As described in our paper, we apply "orth_dist" to fully-connected layers and to convolution layers that share the same orthogonality condition as fully-connected layers (e.g. kernel size 3x3 with stride 3). For all other convolution layers, we apply "deconv_orth_dist".
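The following NumPy sketch illustrates why the two cases differ. It builds a kernel whose flattened filters are exactly orthonormal and then measures how far its strided self-convolution is from an identity map. The helper names and the padding rule `((k - 1) // stride) * stride` are assumptions made for this illustration, not the repository's code.

```python
import numpy as np


def self_conv(kernel: np.ndarray, stride: int) -> np.ndarray:
    """Correlate each filter with every filter at all shifts that are multiples of `stride`."""
    o, _, k, _ = kernel.shape
    pad = ((k - 1) // stride) * stride  # assumption: keeps zero shift centered
    padded = np.pad(kernel, ((0, 0), (0, 0), (pad, pad), (pad, pad)))
    n = 2 * pad // stride + 1
    out = np.zeros((o, o, n, n))
    for a in range(n):
        for b in range(n):
            patch = padded[:, :, a * stride:a * stride + k, b * stride:b * stride + k]
            out[:, :, a, b] = np.einsum("ichw,jchw->ij", patch, kernel)
    return out


def residual(kernel: np.ndarray, stride: int) -> float:
    """Frobenius distance of the self-convolution from a centered identity."""
    out = self_conv(kernel, stride)
    target = np.zeros_like(out)
    center = out.shape[-1] // 2
    target[:, :, center, center] = np.eye(kernel.shape[0])
    return float(np.linalg.norm(out - target))


# Kernel with orthonormal flattened filters: 4 filters, 2 input channels, 3x3.
rng = np.random.default_rng(0)
q, _ = np.linalg.qr(rng.standard_normal((18, 4)))  # orthonormal columns
kernel = q.T.reshape(4, 2, 3, 3)

res_stride3 = residual(kernel, stride=3)  # only the zero shift exists
res_stride1 = residual(kernel, stride=1)  # shifted overlaps also count
```

With stride 3 the self-convolution has a single spatial position, which is the Gram matrix of the flattened filters, so the matrix condition is sufficient and `res_stride3` is zero up to rounding. With stride 1 the shifted overlaps contribute additional terms that the matrix condition does not constrain, so `res_stride1` is clearly nonzero for this kernel.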

Q: Why are specific layers and convs chosen?

A: Empirically, we find that applying the orthogonal loss to the first conv layer (which usually maps 3 RGB channels to 16/64/... channels) leads to unstable performance.
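A minimal sketch of how the first conv layer could be left out when summing the penalty over a model is shown below. The names `orth_regularizer` and `gram_penalty` are hypothetical, "first" is taken to mean first in `model.modules()` order, and `gram_penalty` is only a simple stand-in for whichever distance function is actually used.

```python
import torch
import torch.nn as nn


def gram_penalty(weight: torch.Tensor) -> torch.Tensor:
    """Stand-in penalty: Frobenius distance of the flattened-filter Gram matrix from I."""
    mat = weight.reshape(weight.shape[0], -1)
    if mat.shape[0] > mat.shape[1]:
        mat = mat.t()
    eye = torch.eye(mat.shape[0], dtype=mat.dtype, device=mat.device)
    return torch.linalg.norm(mat @ mat.t() - eye)


def orth_regularizer(model: nn.Module, penalty_fn, skip_first: bool = True) -> torch.Tensor:
    """Sum `penalty_fn` over the model's Conv2d weights, optionally skipping the first."""
    convs = [m for m in model.modules() if isinstance(m, nn.Conv2d)]
    if skip_first:
        convs = convs[1:]
    terms = [penalty_fn(c.weight) for c in convs]
    return torch.stack(terms).sum() if terms else torch.zeros(())


# Toy model: the first conv maps 3 RGB channels to 16 channels.
model = nn.Sequential(
    nn.Conv2d(3, 16, 3, padding=1), nn.ReLU(),
    nn.Conv2d(16, 32, 3, padding=1), nn.ReLU(),
    nn.Conv2d(32, 32, 3, padding=1),
)
reg = orth_regularizer(model, gram_penalty)
reg.backward()  # only the second and third conv layers receive a gradient
```

In a training loop the returned value would be scaled by a regularization weight and added to the task loss before calling `backward()`.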

License and Citation

This software is released under the BSD-3 license.

@inproceedings{wang2019orthogonal,
  title={Orthogonal Convolutional Neural Networks},
  author={Wang, Jiayun and Chen, Yubei and Chakraborty, Rudrasis and Yu, Stella X},
  booktitle={IEEE Conference on Computer Vision and Pattern Recognition (CVPR)},
  year={2020}
}