roboflow/awesome-openai-vision-api-experiments

Stars
1,633
Rank 28,643 (Top 0.6 %)
Language
Python
Created about 1 year ago
Updated 9 months ago

roboflow/awesome-openai-vision-api-experiments

roboflow

There are no reviews yet. Be the first to send feedback to the community and the maintainers!

Must-have resource for anyone who wants to experiment with and build on the OpenAI vision API 🔥

openai vision api experiments 🧪

👋 Hello

The must-have resource for anyone who wants to experiment with and build on the OpenAI Vision API. This repository serves as a hub for innovative experiments, showcasing a variety of applications ranging from simple image classifications to advanced zero-shot learning models. It's a space for both beginners and experts to explore the capabilities of the Vision API, share their findings, and collaborate on pushing the boundaries of visual AI.

Experimenting with the OpenAI API requires an API 🔑. You can get one here.

⚠️ Limitations

100 API requests per single API key per day.
Can't be used for object detection or image segmentation. We can solve this problem by combining GPT-4V with foundational models like GroundingDINO or Segment Anything (SAM). Please take a look at the example and read our blog post.

🧪 Experiments

experiment	complementary materials	authors
WebcamGPT - chat with video stream		@SkalskiP
HotDogGPT - simple image classification application		@SkalskiP
zero-shot image classifier with GPT-4V		@capjamesg
zero-shot object detection with GroundingDINO + GPT-4V		@capjamesg
GPT-4V vs. CLIP		@capjamesg
GPT-4V with Set-of-Mark (SoM)		Jianwei Yang, Hao Zhang, Feng Li, Xueyan Zou, Chunyuan Li, Jianfeng Gao
GPT-4V on Web		@Jiayi-Pan
automated voiceover of NBA game		@SkalskiP

webcamgpt.mov

🗞️ Must Read Papers

Set-of-Mark Prompting Unleashes Extraordinary Visual Grounding in GPT-4V by Jianwei Yang, Hao Zhang, Feng Li, Xueyan Zou, Chunyuan Li, Jianfeng Gao
The Dawn of LMMs: Preliminary Explorations with GPT-4V(ision) by Zhengyuan Yang, Linjie Li, Kevin Lin, Jianfeng Wang, Chung-Ching Lin, Zicheng Liu, Lijuan Wang
GPT-4 System Card by OpenAI

🖊️ Blogs

🦸 Contribution

We would love your help in making this repository even better! Whether you want to add a new experiment or have any suggestions for improvement, feel free to open an issue or pull request.

If you are up to the task and want to add a new experiment, please look at our contribution guide. There you can find all the information you need.

supervision

We write your reusable computer vision tools. 💜

notebooks

Examples and tutorials on using SOTA computer vision models and techniques. Learn everything from old-school ResNet, through YOLO and object-detection transformers like DETR, to the latest models like Grounding DINO and SAM.

Jupyter Notebook

sports

computer vision and sports

maestro

streamline the fine-tuning process for multimodal models: PaliGemma, Florence-2, and Qwen2-VL

inference

A fast, easy-to-use, production-ready inference server for computer vision supporting deployment of many popular model architectures and fine-tuned models.

roboflow-python

The official Roboflow Python package. Manage your datasets, models, and deployments. Roboflow has everything you need to build a computer vision application.

webcamGPT

webcamGPT - chat with video stream 💬 + 📸

roboflow-100-benchmark

Code for replicating Roboflow 100 benchmark results and programmatically downloading benchmark datasets

Jupyter Notebook

dji-aerial-georeferencing

Detect objects in drone videos and plot them on a map

neuralhash-collisions

A catalog of naturally occurring images whose Apple NeuralHash is identical.

template-python

A template repo holding our common setup for a python project

video-inference

Example showing how to do inference on a video file with Roboflow Infer

polygonzone

A web utility to draw polygons and retrieve their coordinates for computer vision applications.

model-leaderboard

Which model is the best at object detection? Which is best for small or large objects? We compare the results in a handy leaderboard.

auto-annotate

A simple tool for automatic image annotation using Roboflow API

homepage-demo

Build an in-browser model experience like the one on the Roboflow homepage.

blackjack-basic-strategy

A computer vision powered Blackjack basic strategy app powered by Roboflow.

roboflow-computer-vision-utilities

Interface with the Roboflow API and Python package for running inference (receiving predictions) and customizing result images from your Roboflow Train computer vision models.

cvevals

Evaluate the performance of computer vision models and prompts for zero-shot models (Grounding DINO, CLIP, BLIP, DINOv2, ImageBind, models hosted on Roboflow)

gpt-checkup

Monitor the performance of OpenAI's GPT-4V model over time.

roboflow-collect

Passively collect images for computer vision datasets on the edge.

deploy-models-with-grpc-pytorch-asyncio

Article about deploying machine learning models using grpc, pytorch and asyncio

RoboflowExpoExample

quickstart-python

Start using computer vision in two minutes with our interactive Python notebook experience.

Jupyter Notebook

clip_video_app

Flask-based web application designed to compare text and image embeddings using the CLIP model.

supashim

Use Supabase as a drop-in replacement for Firebase

roboflow-api-snippets

repo for versioning snippets that show how to use Roboflow APIs

rabbit-deterrence

Uses computer vision to deter rabbits from eating your vegetables

cookbooks

Templates for computer vision projects, referenced in Roboflow blog posts.

roboflow-ios-starter

Official starter project for building iOS apps with Roboflow.

cog-vlm-client

Simple CogVLM client script

rickblocker

Audio visual mitigation of Rickrolls using computer vision.

inference-client

inference-server-old

Object detection inference with Roboflow Train models on NVIDIA Jetson devices.

magic-scissors

Synthetic data for object detection and segmentation

streamlit-web-app

A web-based application for testing models trained with Roboflow. Powered by Streamlit.

OBS-Controller

This is a public repo for the Roboflow OBS Gesture Controller. The gesture controller currently responds to four gestures, "Up", "Down", "Stop", and "Grab". Performing these gestures will allow you to transition scenes and grab source objects inside of OBS.

roboflow-react-app

react starter app for roboflow inference

roboflow-nest

Using Roboflow with the Nest camera API

yolov5-custom-training-tutorial

Jupyter Notebook

inference-dashboard-example

Roboflow's inference server to analyze video streams. This project extracts insights from video frames at defined intervals and generates informative visualizations and CSV outputs.

roboflow-100-3d-website

roboflow-100-3d-website

yolov8-OpenVINO

Deploy a YOLOv8 model (ONNX format) to an Amazon SageMaker endpoint for serving inference requests using ONNXRuntime

Jupyter Notebook

roboflow-swift

roboflow-node

Roboflow CLI and API module for node

roboflow-cli

Command Line Interface for Roboflow

roboflow-jetson-license-plate

Mashup Roboflow Object Detection with OCR to read license plates.

stable-diffusion-demo

Generating 1k images using Stable Diffusion and uploading them into your Roboflow project

Jupyter Notebook

scavenger-hunt

Roboflow SXSW Scavenger Hunt game.

supervision-annotators-hf-space

Demo of Annotators through Gradio

foundation-vision-benchmark

A qualitative set of tests for use in evaluating the capabilities of foundation vision models.

streamlit-bccd

Streamlit App for Blood Cell Count Dataset

cheatsheet-supervision

Supervision cheatsheet website, coded up in Svelte

trt-demos

This is a repo for Roboflow TFT python examples.

roboflow-object-counting

Interface with the Roboflow API and Python package for object counting in your computer vision models.

Jupyter Notebook

roboflow-swift-examples

model-library

roboflow-red

A visual way to interact with computer vision using Node-RED

synthetic-fruit-dataset

Code for Roboflow's How to Create a Synthetic Dataset tutorial.

visual-prompting

fast-ai-resnet32

Jupyter Notebook

c3-sapphire-rapids

Jupyter Notebook

inferencejs-react-example

roboflow-object-tracking

smooth-frame

tao-toolkit-with-roboflow

Jupyter Notebook

clip-benchmark

ODinW-RF100-challenge-issues

ODinW RF100 📸 challenge issues/discussions repository

yolov8-website

Source code for the yolov8.com website.

external-bugtracker

stacked-boxes-email-notification

A small project demonstrating how Roboflow's Inference APIs can be used to trigger email notifications.

server-benchmark

A script you can use to benchmark the Roboflow Deploy targets with your custom trained model on your hardware.

lenny

Lenny uses 500+ blog posts, 100+ docs pages, and Roboflow developer documentation to answer questions about computer vision and Roboflow.