

TASN

Code (MXNet version) for our CVPR'19 paper "Looking for the Devil in the Details: Learning Trilinear Attention Sampling Network for Fine-grained Image Recognition".

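As background for the name: trilinear attention reshapes the backbone feature maps X into a c x hw matrix and roughly computes softmax(X X^T) X, so that each attention map pools spatial detail from related channels. The NumPy sketch below only illustrates this idea; the exact normalization used in the paper and in this code may differ.

    import numpy as np

    def trilinear_attention(feat):
        """Rough sketch of trilinear attention: softmax(X X^T) X.

        feat: feature maps of shape (c, h, w) from a conv backbone.
        Returns attention maps of the same shape. Illustration only;
        the paper/code may normalize differently.
        """
        c, h, w = feat.shape
        x = feat.reshape(c, h * w)                      # X: (c, hw)

        # Normalize each channel spatially so channels are comparable.
        x_norm = x / (x.sum(axis=1, keepdims=True) + 1e-8)

        # Inter-channel relation matrix, softmax-normalized row-wise.
        rel = x_norm @ x.T                              # (c, c)
        rel = np.exp(rel - rel.max(axis=1, keepdims=True))
        rel /= rel.sum(axis=1, keepdims=True)

        # Third multiplication pools the spatial maps of related channels
        # into per-channel attention maps.
        att = rel @ x                                   # (c, hw)
        return att.reshape(c, h, w)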

Prerequisites

  • CUDA 8.0
  • cuDNN 5.0
  • NCCL
  • OpenBLAS (libopenblas)
  • LAPACK (liblapack)
  • OpenCV (libopencv)

Install

First clone this repository:

sudo git clone https://github.com/Heliang-Zheng/TASN.git
cd TASN/tasn-mxnet

Then follow https://mxnet.incubator.apache.org/install/build_from_source.html to compile and install MXNet.

Alternatively, download the pre-built MXNet (with CUDA 8.0) from https://drive.google.com/open?id=1Sfpw0x5XLqBFWAt99-zKOp4jAbOxm5Ws and install it with:

cd TASN/tasn-mxnet/example/tasn
sudo bash install.sh

Train TASN

  1. Enter the tasn directory:

     cd TASN/tasn-mxnet/example/tasn
    
  2. Download the data and the ImageNet-pretrained model:

     sudo bash init.sh
    
  3. Set your NCCL path in train.sh.

  4. Run:

     sudo bash train.sh
    

Experimental settings (CUB-200-2011 dataset: http://www.vision.caltech.edu/visipedia/CUB-200-2011.html):

CNN input resolution: 224×224

Accuracy: 87.0%

Changing the scale of AttSampler() in train.py from 224/512 to 336/512 raises the accuracy to 88.0% (see the sketch below).
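
For illustration only, the relevant call in train.py would look roughly like the lines below. The operator path and argument names (data, attention, scale) are assumptions here; check train.py for the actual call.

    import mxnet as mx  # the custom-built MXNet from the Install section

    # Hypothetical sketch of the change described above; the real call in
    # train.py may use a different operator path or argument names.
    img = mx.sym.Variable('data')
    att_map = mx.sym.Variable('attention')

    sampled = mx.sym.contrib.AttSampler(data=img, attention=att_map,
                                        scale=224.0 / 512)  # 87.0% on CUB-200-2011

    # Raising the sampling scale as described above:
    # sampled = mx.sym.contrib.AttSampler(data=img, attention=att_map,
    #                                     scale=336.0 / 512)  # 88.0%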

Models:

  • cub_224_87: https://drive.google.com/open?id=1uw9MVNVZqBTppN4TBbHB10CxoQonsTx9
  • cub_336_88: https://drive.google.com/open?id=1qQo8o2C5JpwxJGhrfk2xHM-f6kpxDKd1

Added files:

example/tasn/*

src/operator/contrib/att_sampler-inl.h

src/operator/contrib/att_sampler.cc

src/operator/contrib/att_sampler.cu

PyTorch version

Work in progress.

  • Master net added (85.5%).
  • Part net (86.2%), without distillation.

Thanks to https://github.com/ShenghaiRong for reimplementing the attention sampler for the PyTorch version.

I will be very busy in the near future and cannot find time to finish the PyTorch reimplementation. If anyone can tune and complete it, feel free to create a pull request.

Other Implementations

A standalone attention sampler implementation (no need to rebuild MXNet):

https://github.com/wkcn/AttentionSampler
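
For intuition, an attention sampler re-samples the image non-uniformly so that high-attention regions receive more output pixels. One common way to build such a sampler (an assumption about the general technique, not the exact code of the implementations above) is to turn the attention map's row and column marginals into inverse-CDF sampling grids and resample along them, as in the NumPy sketch below.

    import numpy as np

    def attention_sampling_grid(att, out_h, out_w):
        """Build a non-uniform sampling grid from an attention map (h, w).

        Rows/columns with more attention mass receive more output samples.
        This mirrors the inverse-CDF idea behind attention sampling; the
        operators in this repo and the link above may normalize differently.
        """
        # Marginal distributions over rows and columns.
        row = att.sum(axis=1) + 1e-6
        col = att.sum(axis=0) + 1e-6
        row_cdf = np.cumsum(row) / row.sum()
        col_cdf = np.cumsum(col) / col.sum()

        # Invert the CDFs: uniform steps in output space map to
        # attention-weighted positions in input space.
        ys = np.interp(np.linspace(0, 1, out_h), row_cdf, np.arange(att.shape[0]))
        xs = np.interp(np.linspace(0, 1, out_w), col_cdf, np.arange(att.shape[1]))
        return ys, xs

    def sample_image(img, att, out_h, out_w):
        """Nearest-neighbor resample of img (h, w[, c]) along the warped grid."""
        ys, xs = attention_sampling_grid(att, out_h, out_w)
        ys = np.clip(np.round(ys).astype(int), 0, img.shape[0] - 1)
        xs = np.clip(np.round(xs).astype(int), 0, img.shape[1] - 1)
        return img[np.ix_(ys, xs)]

The att_sampler sources listed under "Added files" implement the repository's own CUDA/C++ variant of this operation; see those files for the exact formulation.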

Reference

@inproceedings{zheng2019looking,
  title={Looking for the Devil in the Details: Learning Trilinear Attention Sampling Network for Fine-grained Image Recognition},
  author={Zheng, Heliang and Fu, Jianlong and Zha, Zheng-Jun and Luo, Jiebo},
  booktitle={Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition},
  pages={5012--5021},
  year={2019}
}
