  • Stars: 1,057
  • Rank: 43,651 (Top 0.9%)
  • Created: over 6 years ago
  • Updated: over 1 year ago

Awesome Image Captioning

A curated list of resources on image captioning and related areas. :-)

Contributing

Please feel free to send me pull requests or email ([email protected]) to add links. Markdown format:

- [Paper Name](link) - Author 1 et al, `Conference Year`. [[code]](link)
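
As a rough illustration of the template above, the following sketch checks whether a line follows the entry format before it is submitted. The regular expression, the `parse_entry` helper, and the example URLs are hypothetical and not part of this repository; treating the `[[code]]` link as optional is also an assumption.

```python
import re

# Hypothetical pattern mirroring the template:
# - [Paper Name](link) - Author 1 et al, `Conference Year`. [[code]](link)
# Assumption: the trailing [[code]](link) part may be omitted.
ENTRY_RE = re.compile(
    r"^- \[(?P<title>[^\]]+)\]\((?P<url>[^)\s]+)\)"
    r" - (?P<authors>.+?), `(?P<venue>[^`]+)`\."
    r"(?: \[\[code\]\]\((?P<code>[^)\s]+)\))?$"
)


def parse_entry(line):
    """Return the entry's fields as a dict, or None if the line does not match."""
    match = ENTRY_RE.match(line.strip())
    return match.groupdict() if match else None


# Placeholder entry; the URLs are illustrative only.
example = (
    "- [Paper Name](https://example.com/paper) - Author 1 et al, "
    "`CVPR 2020`. [[code]](https://example.com/code)"
)
print(parse_entry(example))
```

A line that `parse_entry` rejects (it returns `None`) most likely deviates from the template, for example by missing the backticks around the venue or the period after it.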

Change Log

  • May 25: An up-to-date paper list on vision-and-language pre-training is available here.

Table of Contents

  • Papers
      • Survey
      • Before
      • 2015
          • CVPR 2015
          • ICCV 2015
          • NIPS 2015
          • ICML 2015
          • arXiv preprint 2015
      • 2016
          • CVPR 2016
          • ACMMM 2016
          • ACL 2016
          • arXiv preprint 2016
      • 2017
          • CVPR 2017
          • ICCV 2017
          • AAAI 2017
          • NIPS 2017
          • TPAMI 2017
          • arXiv preprint 2017
      • 2018
          • CVPR 2018
          • ECCV 2018
          • AAAI 2018
          • NeurIPS 2018
          • NAACL 2018
          • ACL 2018
          • EMNLP 2018
          • arXiv preprint 2018
      • 2019
          • CVPR 2019
          • AAAI 2019
          • ACL 2019
          • BMVC 2019
          • ICCV 2019
          • NeurIPS 2019
          • IJCAI 2019
          • EMNLP 2019
          • CoNLL 2019
      • 2020
          • AAAI 2020
          • CVPR 2020
          • ACL 2020
          • ECCV 2020
          • EMNLP 2020
          • NeurIPS 2020
  • Dataset
  • Image Captioning Challenge
  • Popular Implementations
      • PyTorch
      • TensorFlow
      • Torch
      • Others

Licenses

CC0

To the extent possible under law, Zhihong Chen has waived all copyright and related or neighboring rights to this work.

More Repositories

1. awesome-radiology-report-generation (171 stars): A curated list of radiology report generation (medical report generation) and related areas. :-)
2. M3AE (Python, 111 stars): [MICCAI-2022] This is the official implementation of Multi-Modal Masked Autoencoders for Medical Vision-and-Language Pre-Training.
3. R2Gen (Python, 79 stars): [EMNLP-2020] The official implementation of Generating Radiology Reports via Memory-driven Transformer.
4. R2GenCMN (Python, 73 stars): [ACL-2021] The official implementation of Cross-modal Memory Networks for Radiology Report Generation.
5. awesome-few-shot-learning-in-nlp (65 stars): A curated list of few-shot learning in NLP. :-)
6. PTUnifier (Python, 59 stars): [ICCV-2023] Towards Unifying Medical Vision-and-Language Pre-training via Soft Prompts
7. awesome-vision-and-language-pretraining (56 stars): A curated list of vision-and-language pre-training (VLP). :-)
8. circleloss.pytorch (Python, 49 stars): Examples of playing with Circle Loss from the paper "Circle Loss: A Unified Perspective of Pair Similarity Optimization", CVPR 2020.
9. ARL (Python, 32 stars): [ACMMM-2022] This is the official implementation of Align, Reason and Learn: Enhancing Medical Vision-and-Language Pre-training with Knowledge.
10. SK-VG (28 stars): [CVPR-2023] The official dataset of Advancing Visual Grounding with Scene Knowledge: Benchmark and Method.
11. awesome-reinforcement-learning-in-nlp (20 stars): A curated list of reinforcement learning in NLP. :-)
12. awesome-disentanglement-in-nlp (17 stars): A curated list of disentanglement in NLP. :-)
13. awesome-zero-shot-learning-in-nlp (13 stars): A curated list of zero-shot learning in NLP. :-)
14. bert-clip-synesthesia (Jupyter Notebook, 12 stars): [Findings of ACL-2023] This is the official implementation of On the Difference of BERT-style and CLIP-style Text Encoders.
15. awesome-causality-in-nlp (8 stars): A curated list of causality in NLP. :-)
16. arXiv-text-generation-papers (8 stars): A curated list of text generation papers in arXiv.
17. awesome-attack-and-defense-in-nlp (5 stars): A curated list of attack and defense in NLP. :-)
18. weakly-supervised-segmentation (Python, 5 stars): Weakly supervised medical image segmentation.
19. awesome-nlp-surveys (2 stars): A curated list of surveys in NLP. :-)
20. awesome-contrastive-learning-in-nlp (2 stars): A curated list of contrastive learning in NLP. :-)
21. awesome-noisy-channel-model (1 star): A curated list of noisy channel model and related areas. :-)
22. awesome-interesting-topics-in-nlp (1 star): A curated list of interesting topics in NLP. :-)
23. mae.pytorch (Python, 1 star): Simple and clean implementation of MAE (Masked Autoencoders Are Scalable Vision Learners) using Huggingface Transformers.