Repository Details
Like Google's Gemini demo, this project uses AI to answer spoken questions about visual input. It combines GPT-4 Vision for image understanding, Whisper for voice recognition, and Resemble AI for voice synthesis, so it can interpret what it sees and respond verbally.
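
The three-stage flow described above (speech to text, image question answering, text to speech) might be wired together roughly as follows. This is a minimal sketch, not the repository's actual code: `VisualAssistant`, `respond`, and the `fake_*` stand-ins are hypothetical names, and the real project would presumably place Whisper, GPT-4 Vision, and Resemble AI clients behind the three callables.

```python
from dataclasses import dataclass
from typing import Callable

# Hypothetical stage signatures (assumptions, not taken from the repository).
Transcriber = Callable[[bytes], str]        # audio -> question text (Whisper's role)
VisionAnswerer = Callable[[str, bytes], str]  # question + image -> answer (GPT-4 Vision's role)
Synthesizer = Callable[[str], bytes]        # answer text -> audio (Resemble AI's role)


@dataclass
class VisualAssistant:
    """Chains the three stages; each stage is injected as a plain callable."""
    transcribe: Transcriber
    answer: VisionAnswerer
    synthesize: Synthesizer

    def respond(self, audio: bytes, image: bytes) -> bytes:
        question = self.transcribe(audio)     # 1. voice recognition
        reply = self.answer(question, image)  # 2. image understanding
        return self.synthesize(reply)         # 3. voice synthesis


# Stand-in stages so the sketch runs without network access or API keys.
def fake_transcribe(audio: bytes) -> str:
    return audio.decode("utf-8")


def fake_answer(question: str, image: bytes) -> str:
    return f"Q: {question} | image bytes: {len(image)}"


def fake_synthesize(text: str) -> bytes:
    return text.encode("utf-8")


assistant = VisualAssistant(fake_transcribe, fake_answer, fake_synthesize)
```

Injecting the stages as callables keeps the pipeline testable offline; swapping a stand-in for a real client should not require changes to `respond`.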