Official PyTorch implementation of C3-GAN (ICLR 2022)
Contrastive Fine-grained Class Clustering via Generative Adversarial Networks [Paper]
Authors: Yunji Kim, Jung-Woo Ha
Abstract
Unsupervised fine-grained class clustering is a practical yet challenging task due to the difficulty of feature representations learning of subtle object details. We introduce C3-GAN, a method that leverages the categorical inference power of InfoGAN with contrastive learning. We aim to learn feature representations that encourage a dataset to form distinct cluster boundaries in the embedding space, while also maximizing the mutual information between the latent code and its image observation. Our approach is to train a discriminator, which is also used for inferring clusters, to optimize the contrastive loss, where image-latent pairs that maximize the mutual information are considered as positive pairs and the rest as negative pairs. Specifically, we map the input of a generator, which was sampled from the categorical distribution, to the embedding space of the discriminator and let them act as a cluster centroid. In this way, C3-GAN succeeded in learning a clustering-friendly embedding space where each cluster is distinctively separable. Experimental results show that C3-GAN achieved the state-of-the-art clustering performance on four fine-grained image datasets, while also alleviating the mode collapse phenomenon.
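To make the centroid-based contrastive idea in the abstract more concrete, here is a minimal sketch in plain Python. It is not the authors' implementation: the InfoNCE-style form, the cosine similarity, the temperature value, and every name (`cluster_probs`, `contrastive_loss`, the toy centroids) are assumptions chosen for illustration. It treats the centroid of the sampled cluster as the positive and all other centroids as negatives, and reuses the same softmax to infer a cluster.

```python
import math

def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

def cluster_probs(embedding, centroids, temperature=0.1):
    """Softmax over temperature-scaled similarities to every centroid."""
    logits = [cosine(embedding, c) / temperature for c in centroids]
    peak = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def contrastive_loss(embedding, centroids, positive_index, temperature=0.1):
    """Negative log-probability of the positive centroid (InfoNCE-style)."""
    probs = cluster_probs(embedding, centroids, temperature)
    return -math.log(probs[positive_index])

# Toy 2-D example with three hypothetical cluster centroids.
centroids = [[1.0, 0.0], [0.0, 1.0], [-1.0, 0.0]]
image_embedding = [0.9, 0.1]

probs = cluster_probs(image_embedding, centroids)
predicted_cluster = max(range(len(probs)), key=probs.__getitem__)
loss_pos = contrastive_loss(image_embedding, centroids, positive_index=0)
loss_neg = contrastive_loss(image_embedding, centroids, positive_index=2)
```

In this toy setup the loss is small when the embedding sits near its positive centroid and large otherwise, which is the behavior that pushes clusters apart in the embedding space.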
I. Things to do before running the code
The initial code is optimized for CUB dataset.
※ Hyperparameter settings
You can adjust hyperparameter values, such as the number of clusters and the degree of perturbation, in the config.py file.
※ Annotate data for evaluation
Evaluating the Accuracy (ACC) and Normalized Mutual Information (NMI) scores requires each image to be annotated with its ground-truth class label. The class label should be stored as an integer. Please check the sample files in data/cub. You may also need to adjust the datasets.py file depending on where you saved the image files and how you created the annotation files.
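For reference, the two scores can be computed from integer labels roughly as follows. This is a sketch under stated assumptions, not the repository's evaluation code: ACC is taken to be the usual clustering accuracy under the best one-to-one matching between clusters and classes (found with the Hungarian algorithm), and NMI is normalized by the arithmetic mean of the two entropies. All function names are hypothetical, and the repository may treat over-clustering differently.

```python
import math
from collections import Counter

def entropy(labels):
    """Shannon entropy (in nats) of a label sequence."""
    n = len(labels)
    return -sum((c / n) * math.log(c / n) for c in Counter(labels).values())

def nmi(true_labels, pred_labels):
    """Mutual information divided by the mean of the two entropies."""
    n = len(true_labels)
    joint = Counter(zip(true_labels, pred_labels))
    true_counts, pred_counts = Counter(true_labels), Counter(pred_labels)
    mi = sum((c / n) * math.log(c * n / (true_counts[t] * pred_counts[p]))
             for (t, p), c in joint.items())
    denom = 0.5 * (entropy(true_labels) + entropy(pred_labels))
    return mi / denom if denom > 0 else 1.0

def hungarian(cost):
    """Minimum-cost assignment on a square matrix; returns a column per row."""
    n = len(cost)
    u, v = [0.0] * (n + 1), [0.0] * (n + 1)
    match = [0] * (n + 1)  # match[j] = 1-based row assigned to column j
    way = [0] * (n + 1)
    for i in range(1, n + 1):
        match[0] = i
        j0 = 0
        minv = [math.inf] * (n + 1)
        used = [False] * (n + 1)
        while True:
            used[j0] = True
            i0, delta, j1 = match[j0], math.inf, 0
            for j in range(1, n + 1):
                if not used[j]:
                    cur = cost[i0 - 1][j - 1] - u[i0] - v[j]
                    if cur < minv[j]:
                        minv[j], way[j] = cur, j0
                    if minv[j] < delta:
                        delta, j1 = minv[j], j
            for j in range(n + 1):
                if used[j]:
                    u[match[j]] += delta
                    v[j] -= delta
                else:
                    minv[j] -= delta
            j0 = j1
            if match[j0] == 0:
                break
        while j0:  # walk back along the augmenting path
            j1 = way[j0]
            match[j0] = match[j1]
            j0 = j1
    assignment = [0] * n
    for j in range(1, n + 1):
        assignment[match[j] - 1] = j - 1
    return assignment

def clustering_accuracy(true_labels, pred_labels):
    """Fraction of images whose cluster is matched to their true class."""
    classes, clusters = sorted(set(true_labels)), sorted(set(pred_labels))
    size = max(len(classes), len(clusters))
    class_pos = {c: i for i, c in enumerate(classes)}
    cluster_pos = {c: i for i, c in enumerate(clusters)}
    # cost[cluster][class] = -(number of shared images), zero-padded to square
    cost = [[0] * size for _ in range(size)]
    for t, p in zip(true_labels, pred_labels):
        cost[cluster_pos[p]][class_pos[t]] -= 1
    assignment = hungarian(cost)
    return sum(-cost[r][c] for r, c in enumerate(assignment)) / len(true_labels)

# Toy labels: 9 images, 3 classes, one image placed in the wrong cluster.
true_labels = [0, 0, 0, 1, 1, 1, 2, 2, 2]
pred_labels = [2, 2, 2, 0, 0, 1, 1, 1, 1]
acc = clustering_accuracy(true_labels, pred_labels)
score = nmi(true_labels, pred_labels)
```

Cluster indices do not need to agree with class indices, since both scores are invariant to how the clusters are numbered.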
II. Train
Once you have set every argument in the config.py file, you can start training with the simple command below.
python train.py
※ Trained models
To load the parameters of the trained models, set cfg.NUM_GT_CLASSES and cfg.OVER according to the table below, and set cfg.MODEL_PATH to wherever you saved the file.
Depending on the initial weights, the trained models vary in clustering quality and sampling quality. Since we chose to share the ones with better sampling quality, the scores may differ from the numbers in the paper by about 1 point.
Dataset | cfg.NUM_GT_CLASSES | cfg.OVER | parameters |
---|---|---|---|
CUB | 200 | 2 | link |
Stanford Cars | 196 | 3 | link |
Stanford Dogs | 120 | 3 | link |
Oxford Flower | 102 | 3 | link |
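As a convenience, the table can be kept as a small lookup that fills in the three fields named above. This is only a sketch: `SimpleNamespace` stands in for the repository's actual `cfg` object, the helper name `apply_pretrained_settings` is hypothetical, and the checkpoint path is a placeholder you would replace with your own download location.

```python
from types import SimpleNamespace

# (cfg.NUM_GT_CLASSES, cfg.OVER) pairs copied from the table above.
PRETRAINED_SETTINGS = {
    "CUB": (200, 2),
    "Stanford Cars": (196, 3),
    "Stanford Dogs": (120, 3),
    "Oxford Flower": (102, 3),
}

def apply_pretrained_settings(cfg, dataset, model_path):
    """Set the fields needed to load a released model for `dataset`."""
    num_gt_classes, over = PRETRAINED_SETTINGS[dataset]
    cfg.NUM_GT_CLASSES = num_gt_classes
    cfg.OVER = over
    cfg.MODEL_PATH = model_path
    return cfg

# Stand-in for the repository's cfg object (an assumption for this sketch).
cfg = SimpleNamespace()
apply_pretrained_settings(cfg, "Stanford Dogs", "path/to/your/checkpoint")
```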
III. Results
※ Fine-grained Class Clustering Results
Method | ACC (Bird) | ACC (Car) | ACC (Dog) | ACC (Flower) | NMI (Bird) | NMI (Car) | NMI (Dog) | NMI (Flower) |
---|---|---|---|---|---|---|---|---|
IIC | 7.4 | 4.9 | 5.0 | 8.7 | 0.36 | 0.27 | 0.18 | 0.24 |
SimCLR + k-Means | 8.4 | 6.7 | 6.8 | 12.5 | 0.40 | 0.33 | 0.19 | 0.29 |
InfoGAN | 8.6 | 6.5 | 6.4 | 23.2 | 0.39 | 0.31 | 0.21 | 0.44 |
FineGAN | 6.9 | 6.8 | 6.0 | 8.1 | 0.37 | 0.33 | 0.22 | 0.24 |
MixNMatch | 10.2 | 7.3 | 10.3 | 39.0 | 0.41 | 0.34 | 0.30 | 0.57 |
SCAN | 11.9 | 8.8 | 12.3 | 56.5 | 0.45 | 0.38 | 0.35 | 0.77 |
C3-GAN | 27.6 | 14.1 | 17.9 | 67.8 | 0.53 | 0.41 | 0.36 | 0.67 |
※ Image Generation Results
Conditional Generation
Images synthesized with the cluster indices of real images that were predicted by the discriminator.
Random Generation
Images synthesized by controlling values of the latent code c and the random noise z.
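The two ways of controlling the generator inputs can be sketched as follows. The abstract states that the latent code c is sampled from a categorical distribution; representing it as a one-hot vector, drawing z from a standard normal, and the tiny dimensions used here are assumptions for illustration only, and the generator itself is left out.

```python
import random

def sample_one_hot(num_clusters, index=None, rng=random):
    """One-hot latent code c; a random cluster is drawn if index is None."""
    if index is None:
        index = rng.randrange(num_clusters)
    return [1.0 if k == index else 0.0 for k in range(num_clusters)]

def sample_noise(dim, rng=random):
    """Random noise z with independent standard normal entries (assumed)."""
    return [rng.gauss(0.0, 1.0) for _ in range(dim)]

NUM_CLUSTERS = 5  # hypothetical, kept small for readability
NOISE_DIM = 4     # hypothetical

# Fix c and vary z: different samples from the same cluster.
fixed_c = sample_one_hot(NUM_CLUSTERS, index=2)
same_cluster_inputs = [(fixed_c, sample_noise(NOISE_DIM)) for _ in range(3)]

# Fix z and vary c: the same noise rendered under every cluster.
fixed_z = sample_noise(NOISE_DIM)
same_noise_inputs = [(sample_one_hot(NUM_CLUSTERS, index=k), fixed_z)
                     for k in range(NUM_CLUSTERS)]
```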
※※ BibTeX
@inproceedings{kim2022c3gan,
title={Contrastive Fine-grained Class Clustering via Generative Adversarial Networks},
author={Kim, Yunji and Ha, Jung-Woo},
booktitle={ICLR},
year={2022}
}
※※ Acknowledgement
This code was developed from the released source code of FineGAN: Unsupervised Hierarchical Disentanglement for Fine-grained Object Generation and Discovery.
License
Copyright 2022-present NAVER Corp.
Redistribution and use in source and binary forms, with or without
modification, are permitted provided that the following conditions are met:
* Redistributions of source code must retain the above copyright notice, this
list of conditions and the following disclaimer.
* Redistributions in binary form must reproduce the above copyright notice,
this list of conditions and the following disclaimer in the documentation
and/or other materials provided with the distribution.
THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS"
AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE
DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT HOLDER OR CONTRIBUTORS BE LIABLE
FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL
DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR
SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER
CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY,
OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.