

Code for our NeurIPS'19 paper "Learning Deep Bilinear Transformation for Fine-grained Image Representation"

DBTNet

MXNet version of the code for our NeurIPS'19 paper "Learning Deep Bilinear Transformation for Fine-grained Image Representation"

Bilinear feature transformation has shown state-of-the-art performance in learning fine-grained image representations. The proposed DBTNet deeply integrates bilinear features into a CNN to learn such representations.
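To make "bilinear features" concrete, here is a minimal NumPy sketch of generic bilinear pooling: the outer product of the channel vector at each spatial location, averaged over all locations. The second function is an illustrative simplification that restricts the interactions to channel groups, which shows why grouping shrinks the output dimension. It is not the DBT block from the paper or from this repository. The function names and the toy feature map are assumptions made for this example.

```python
import numpy as np


def bilinear_pool(features):
    """Generic bilinear pooling of a (C, H, W) feature map.

    Returns a vector of length C*C: the outer product of the channel
    vector at each location, averaged over the H*W locations.
    """
    c, h, w = features.shape
    x = features.reshape(c, h * w)      # C x N matrix, N = H*W
    gram = x @ x.T / (h * w)            # C x C pairwise channel interactions
    return gram.reshape(-1)


def grouped_bilinear_pool(features, groups):
    """Illustrative grouped variant (an assumption, not the paper's DBT block).

    Channels are split into equal groups, interactions are computed only
    within each group, and the per-group results are concatenated.
    Output length is groups * (C / groups) ** 2.
    """
    c = features.shape[0]
    if c % groups != 0:
        raise ValueError("channel count must be divisible by groups")
    size = c // groups
    parts = [
        bilinear_pool(features[g * size:(g + 1) * size])
        for g in range(groups)
    ]
    return np.concatenate(parts)


# Toy input: 8 channels on a 4x4 spatial grid.
rng = np.random.default_rng(0)
feature_map = rng.standard_normal((8, 4, 4))

full = bilinear_pool(feature_map)                       # 8 * 8 = 64 values
grouped = grouped_bilinear_pool(feature_map, groups=4)  # 4 * (2 * 2) = 16 values
print(full.shape, grouped.shape)
```

With 8 channels the full bilinear vector has 64 entries, while four groups of two channels give 16, so the representation size drops by a factor equal to the number of groups in this simplified setting.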

Framework

[Figure: DBTNet framework (image not available)]

Main Results

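Accuracy (%) by method and feature dimension:

| Method | Dimension | CUB-200-2011 | Stanford-Car | Aircraft |
|---|---|---|---|---|
| Compact Bilinear | 14k | 81.6 | 88.6 | 81.6 |
| Kernel Pooling | 14k | 84.7 | 91.1 | 85.7 |
| iSQRT-COV | 8k | 87.3 | 91.7 | 89.5 |
| iSQRT-COV | 32k | 88.1 | 92.8 | 90.0 |
| DBTNet-50 (ours) | 2k | 87.5 | 94.1 | 91.2 |
| DBTNet-101 (ours) | 2k | 88.1 | 94.5 | 91.6 |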

Prerequisites

MXNet 1.3.1

GluonCV 0.3.0
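A quick way to confirm that the environment matches these versions is sketched below. It is not part of the repository. It assumes the packages were installed under the PyPI distribution names `mxnet` and `gluoncv`; GPU builds of MXNet are often published under other names (for example with a CUDA suffix), in which case the `REQUIRED` mapping needs adjusting. The helper name `check_versions` is hypothetical.

```python
from importlib import metadata

# Versions listed under Prerequisites. The distribution names are an
# assumption; GPU builds of MXNet may be installed under a different name.
REQUIRED = {"mxnet": "1.3.1", "gluoncv": "0.3.0"}


def check_versions(required=REQUIRED, get_version=metadata.version):
    """Return {package: (required, installed_or_None)} for every mismatch."""
    problems = {}
    for name, wanted in required.items():
        try:
            found = get_version(name)
        except metadata.PackageNotFoundError:
            found = None  # package is not installed under this name
        if found != wanted:
            problems[name] = (wanted, found)
    return problems


if __name__ == "__main__":
    mismatches = check_versions()
    for name, (wanted, found) in mismatches.items():
        print(f"{name}: need {wanted}, found {found or 'not installed'}")
    if not mismatches:
        print("All prerequisite versions match.")
```

The `get_version` parameter exists only so the check can be exercised without installing anything; by default it queries the active Python environment.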

Quick Start

Prepare the data:

Download the ImageNet data:

cd data/imagenet/
wget https://australiav100data.blob.core.windows.net/heliang/imagenet_train.rec
wget https://australiav100data.blob.core.windows.net/heliang/imagenet_train.idx
wget https://australiav100data.blob.core.windows.net/heliang/imagenet_val.rec
wget https://australiav100data.blob.core.windows.net/heliang/imagenet_val.idx

Download the CUB-200-2011 dataset:

cd data/
wget https://australiav100data.blob.core.windows.net/heliang/cub.tar
tar -xvf cub.tar
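Before starting a long training run, it can help to confirm that the downloads landed where the commands above put them. The sketch below is not part of the repository. It assumes the commands were run from the repository root, and it only checks the files named in those commands, because the layout inside `cub.tar` is not documented here. The helper name `missing_files` is hypothetical.

```python
from pathlib import Path

# Paths taken from the download commands above, relative to the
# repository root (running from the root is an assumption).
EXPECTED = [
    "data/imagenet/imagenet_train.rec",
    "data/imagenet/imagenet_train.idx",
    "data/imagenet/imagenet_val.rec",
    "data/imagenet/imagenet_val.idx",
    "data/cub.tar",
]


def missing_files(repo_root=".", expected=EXPECTED):
    """Return the expected paths that do not exist as files under repo_root."""
    root = Path(repo_root)
    return [p for p in expected if not (root / p).is_file()]


if __name__ == "__main__":
    missing = missing_files()
    if missing:
        print("Missing files:")
        for path in missing:
            print("  " + path)
    else:
        print("All expected data files are present.")
```

The check only tests for presence. It does not verify file sizes or checksums, so a truncated download would still pass.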

Train the model on the ImageNet dataset:

cd code/
bash train_imagenet_dbt.sh

Fine-tune the model on the CUB-200-2011 dataset:

The ImageNet pretrained model is available.

cd code/
bash ft_cub_dbt.sh

PyTorch Version

In progress. You are welcome to reimplement and share the DBT code in PyTorch.

Citation

If any part of our paper or code is helpful to your work, please cite:

@incollection{NIPS2019_8680,
  title = {Learning Deep Bilinear Transformation for Fine-grained Image Representation},
  author = {Zheng, Heliang and Fu, Jianlong and Zha, Zheng-Jun and Luo, Jiebo},
  booktitle = {Advances in Neural Information Processing Systems 32},
  pages = {4279--4288},
  year = {2019}
}
