ICLR 2023 DeCap: Decoding CLIP Latents for Zero-shot Captioning

Official implementation for DeCap

DeCap: Decoding CLIP Latents for Zero-Shot Captioning via Text-Only Training

Published at ICLR 2023

Paper link: DeCap

Data

Download coco_train and cc3m_train to data.
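Before starting training, it can help to confirm that the downloads landed where the scripts will look for them. The following is a minimal sketch, not part of the repository: the helper `missing_files` and the file names `coco_train.json` and `cc3m_train.json` are hypothetical, and it assumes the files sit directly under `data/`. Check the training scripts for the actual names and paths.

```python
from pathlib import Path


def missing_files(data_dir, names):
    """Return the entries of `names` that are not regular files in `data_dir`."""
    root = Path(data_dir)
    return [name for name in names if not (root / name).is_file()]


if __name__ == "__main__":
    # Hypothetical file names; replace them with whatever the downloads are called.
    expected = ["coco_train.json", "cc3m_train.json"]
    missing = missing_files("data", expected)
    if missing:
        print("Missing from data/:", ", ".join(missing))
    else:
        print("All expected training files found.")
```

Only the file for the dataset you plan to train on needs to be present, so a missing entry for the other dataset is not necessarily a problem.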

Training

./train_coco.sh

or

./train_cc3m.sh

Inference

See inference_decap.ipynb.

Pretrained models

Trained on COCO captions: model_coco

Trained on CC3M: coming soon

Citation

@inproceedings{lidecap,
  title={DeCap: Decoding CLIP Latents for Zero-Shot Captioning via Text-Only Training},
  author={Li, Wei and Zhu, Linchao and Wen, Longyin and Yang, Yi},
  booktitle={The Eleventh International Conference on Learning Representations},
  year={2023}
}

Acknowledgments

This repository is heavily based on ClipCap. For training, we used data from the COCO dataset and Conceptual Captions.