
Discriminability objective for training descriptive captions

This is the implementation of the paper Discriminability Objective for Training Descriptive Captions (CVPR 2018).

Requirements

Python 2.7 (because there is no coco-caption version for Python 3)

PyTorch 1.0 (along with torchvision)

Java 1.8 (for coco-caption)

Downloads

Clone the repository

git clone --recursive https://github.com/ruotianluo/DiscCaptioning.git

Data split

In this paper we use the data split from Context-aware Captions from Context-agnostic Supervision. It is different from the standard Karpathy split, so we need to download different files.

Download link: Google drive link

To train on your own, you only need to download dataset_coco.json, but downloading cocotalk.json and cocotalk_label.h5 is suggested as well. If you want to run the pretrained model, you have to download all three files.
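The three files above can be sketched as a quick pre-flight check before training. This is a hypothetical helper, not part of the repository; it only assumes the files are placed under data/ as the preprocessing commands below expect.

```python
# Minimal sketch: verify the files from the Google Drive link are in
# place. The data/ paths are assumptions based on the commands below.
import os

REQUIRED = [
    "data/dataset_coco.json",   # needed to train from scratch
    "data/cocotalk.json",       # needed to run the pretrained model
    "data/cocotalk_label.h5",   # needed to run the pretrained model
]

def missing_files(paths):
    """Return the subset of paths that do not exist yet."""
    return [p for p in paths if not os.path.isfile(p)]

if __name__ == "__main__":
    for p in missing_files(REQUIRED):
        print("missing: " + p)
```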

coco-caption

cd coco-caption
bash ./get_stanford_models.sh
cd annotations
# Download captions_val2014.json from the google drive link above to this folder
cd ../../

We need to replace captions_val2014.json because the original file can only evaluate images from the val2014 set, while we are using Rama's split.

Pre-computed feature

In this paper, the retrieval model uses the outputs of the last layer of ResNet-101, and the captioning model uses the bottom-up features from https://arxiv.org/abs/1707.07998.

The features can be downloaded from the same link; decompress them into data/cocotalk_fc and data/cocobu_att respectively.
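After decompressing, the features can be loaded per image. The sketch below assumes one .npy (pooled fc) and one .npz (bottom-up attention) file per COCO image id, the layout used by the upstream ImageCaptioning.pytorch loaders; check the actual directory contents after decompressing, as the exact filenames are an assumption here.

```python
# Sketch: load pre-computed features for one image. The per-image-id
# file layout is an assumption; verify against the decompressed folders.
import os
import numpy as np

def load_features(image_id, fc_dir="data/cocotalk_fc", att_dir="data/cocobu_att"):
    # Pooled ResNet-101 feature for the retrieval model, shape (2048,)
    fc = np.load(os.path.join(fc_dir, "%d.npy" % image_id))
    # Bottom-up region features for the captioning model, shape (boxes, 2048)
    att = np.load(os.path.join(att_dir, "%d.npz" % image_id))["feat"]
    return fc, att
```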

Pretrained models

Download the pretrained models from link. Decompress them into the root folder.

To evaluate a pretrained model, run:

bash eval.sh att_d1 test

The pretrained models can match the results shown in the paper.

Train on your own

Preprocessing

Preprocess the captions (skip if you already have 'cocotalk.json' and 'cocotalk_label.h5'):

$ python scripts/prepro_labels.py --input_json data/dataset_coco.json --output_json data/cocotalk.json --output_h5 data/cocotalk

Preprocess for self-critical training:

$ python scripts/prepro_ngrams.py --input_json data/dataset_coco.json --dict_json data/cocotalk.json --output_pkl data/coco-train --split train
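The n-gram preprocessing above precomputes statistics needed for self-critical (CIDEr) training. As a rough illustration of what is being counted, the sketch below computes document frequencies of caption n-grams, the quantity CIDEr's tf-idf weighting relies on. It is a simplified stand-in, not the actual prepro_ngrams.py logic.

```python
# Sketch: document frequency of n-grams over a caption corpus,
# the core statistic behind CIDEr's tf-idf weights. Simplified:
# no tokenization/stemming, unlike the real preprocessing script.
from collections import Counter

def ngrams(tokens, n):
    """All contiguous n-grams of a token list, as tuples."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]

def doc_frequency(captions, max_n=4):
    """Count, for each n-gram up to max_n, how many captions contain it."""
    df = Counter()
    for caption in captions:
        toks = caption.split()
        seen = set()
        for n in range(1, max_n + 1):
            seen.update(ngrams(toks, n))
        df.update(seen)
    return df
```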

Start training

First train a retrieval model:

bash run_fc_con.sh

Second, pretrain the captioning model:

bash run_att.sh

Third, finetune the captioning model with cider+discriminability optimization:

bash run_att_d.sh 1

The argument 1 is the discriminability weight, and can be changed to other values.
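Conceptually, the finetuning step optimizes CIDEr plus the discriminability (retrieval) score scaled by that weight. The sketch below shows the shape of this combined reward; the function and variable names are illustrative, not the repository's actual code.

```python
# Sketch of the cider + discriminability objective used in finetuning.
# disc_weight corresponds to the argument passed to run_att_d.sh.
def combined_reward(cider_score, disc_score, disc_weight=1.0):
    """Reward for policy-gradient finetuning (illustrative, not exact code)."""
    return cider_score + disc_weight * disc_score
```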

Evaluate

bash eval.sh att_d1 test

Citation

If you find this useful, please consider citing:

@InProceedings{Luo_2018_CVPR,
author = {Luo, Ruotian and Price, Brian and Cohen, Scott and Shakhnarovich, Gregory},
title = {Discriminability Objective for Training Descriptive Captions},
booktitle = {The IEEE Conference on Computer Vision and Pattern Recognition (CVPR)},
month = {June},
year = {2018}
}

Acknowledgements

The code is based on ImageCaptioning.pytorch.
