• Stars
    star
    269
  • Rank 152,662 (Top 4 %)
  • Language
    Python
  • Created over 4 years ago
  • Updated over 3 years ago

Reviews

There are no reviews yet. Be the first to send feedback to the community and the maintainers!

Repository Details

Implementation of 'X-Linear Attention Networks for Image Captioning' [CVPR 2020]

Introduction

This repository is for X-Linear Attention Networks for Image Captioning (CVPR 2020). The original paper can be found here.

Please cite with the following BibTeX:

@inproceedings{xlinear2020cvpr,
  title={X-Linear Attention Networks for Image Captioning},
  author={Pan, Yingwei and Yao, Ting and Li, Yehao and Mei, Tao},
  booktitle={Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition},
  year={2020}
}

Requirements

Data preparation

  1. Download the bottom up features and convert them to npz files
python2 tools/create_feats.py --infeats bottom_up_tsv --outfolder ./mscoco/feature/up_down_10_100
  1. Download the annotations into the mscoco folder. More details about data preparation can be referred to self-critical.pytorch

  2. Download coco-caption and setup the path of __C.INFERENCE.COCO_PATH in lib/config.py

  3. The pretrained models and results can be downloaded here.

  4. The pretrained SENet-154 model can be downloaded here.

Training

Train X-LAN model

bash experiments/xlan/train.sh

Train X-LAN model using self critical

Copy the pretrained model into experiments/xlan_rl/snapshot and run the script

bash experiments/xlan_rl/train.sh

Train X-LAN transformer model

bash experiments/xtransformer/train.sh

Train X-LAN transformer model using self critical

Copy the pretrained model into experiments/xtransformer_rl/snapshot and run the script

bash experiments/xtransformer_rl/train.sh

Evaluation

CUDA_VISIBLE_DEVICES=0 python3 main_test.py --folder experiments/model_folder --resume model_epoch

Acknowledgements

Thanks the contribution of self-critical.pytorch and awesome PyTorch team.

More Repositories

1

fast-reid

SOTA Re-identification Methods and Toolbox
Python
3,377
star
2

FaceX-Zoo

A PyTorch Toolbox for Face Recognition
Python
1,863
star
3

dabnn

dabnn is an accelerated binary neural networks inference framework for mobile platform
C++
767
star
4

DCL

Destruction and Construction Learning for Fine-grained Image Recognition
Python
585
star
5

centerX

This repo is implemented based on detectron2 and centernet
Python
554
star
6

CoTNet

This is an official implementation for "Contextual Transformer Networks for Visual Recognition".
Python
512
star
7

VeRidataset

This is the project page for veri dataset which is a large scale image dataset for vehicle re-identification in urban traffic surveillance.
MATLAB
397
star
8

DNNLibrary

Daquexian's NNAPI Library. ONNX + Android NNAPI
C++
346
star
9

lapa-dataset

A large-scale dataset for face parsing (AAAI2020)
258
star
10

Down-to-the-Last-Detail-Virtual-Try-on-with-Detail-Carving

Virtural try-on under arbitrary poses
Python
217
star
11

Partial-Person-ReID

Python
168
star
12

FADA

(ECCV 2020) Classes Matter: A Fine-grained Adversarial Approach to Cross-domain Semantic Segmentation
Python
140
star
13

DSD-SATN

ICCV19: Official code of Human Mesh Recovery from Monocular Images via a Skeleton-disentangled Representation
Python
133
star
14

LIO

Look-into-Object: Self-supervised Structure Modeling for Object Recognition (CVPR 2020)
Jupyter Notebook
113
star
15

PGPT

Implementation of โ€˜Pose-Guided Tracking-by-Detection: Robust Multi-Person Pose Trackingโ€™ [TMM 2020]
Python
47
star
16

CM-NAS

CM-NAS: Cross-Modality Neural Architecture Search for Visible-Infrared Person Re-Identification (ICCV2021)
Python
46
star
17

CoTNet-ObjectDetection-InstanceSegmentation

Python
33
star
18

dabnn-example

Android demo for dabnn
Java
19
star
19

atlasWrapper

C++
8
star