

Repository Details

Code and Model for VAST: A Vision-Audio-Subtitle-Text Omni-Modality Foundation Model and Dataset

VAST: A Vision-Audio-Subtitle-Text Omni-Modality Foundation Model and Dataset

This is the official repository of VAST, which will provide the code, model checkpoints, and dataset. They will be released after the paper is accepted.


Citation

If you find this code useful for your research, please consider citing:

@article{chen2023vast,
  title={VAST: A Vision-Audio-Subtitle-Text Omni-Modality Foundation Model and Dataset},
  author={Chen, Sihan and Li, Handong and Wang, Qunbo and Zhao, Zijia and Sun, Mingzhen and Zhu, Xinxin and Liu, Jing},
  journal={arXiv preprint arXiv:2305.18500},
  year={2023}
}