

Repository Details

Code and Model for VAST: A Vision-Audio-Subtitle-Text Omni-Modality Foundation Model and Dataset

VAST: A Vision-Audio-Subtitle-Text Omni-Modality Foundation Model and Dataset

This is the official repository of VAST, which will provide the code, model checkpoints, and dataset. They will be released after the paper is accepted.


Citation

If you find this code useful for your research, please consider citing:

@article{chen2023vast,
  title={VAST: A Vision-Audio-Subtitle-Text Omni-Modality Foundation Model and Dataset},
  author={Chen, Sihan and Li, Handong and Wang, Qunbo and Zhao, Zijia and Sun, Mingzhen and Zhu, Xinxin and Liu, Jing},
  journal={arXiv preprint arXiv:2305.18500},
  year={2023}
}