Transformer reading list
This repo collects recent popular Transformer papers, code, and learning resources across domains such as Vision Transformers, NLP, and multi-modal learning.
Topics (papers and code)
Review papers in multi-modal learning
Tutorials and workshops
- Cross-View and Cross-Modal Visual Geo-Localization: IEEE CVPR 2021 Tutorial
- From VQA to VLN: Recent Advances in Vision-and-Language Research: IEEE CVPR 2021 Tutorial
- Tutorial on MultiModal Machine Learning: IEEE CVPR 2022 Tutorial
Datasets
Blogs
Tools
- PyTorchVideo: a deep learning library for video understanding research
- horovod: a tool for multi-GPU parallel processing
- accelerate: an easy API for mixed precision and any kind of distributed computing