Repository Details
A framework for real-time video understanding and interaction built on large multimodal models. It supports question answering and conversation about the world through text, audio, images, and video.