Crop-CLIP
You can sponsor me to support my open source work.
Search for subjects/objects in an image using a simple text description and get cropped results.
- 2022-1-04 Added a Colab notebook for YouTube videos
Highlights
Video Results: (Baby Driver bank robbery scene)
- Search the scene and zoom in on the subject.
Search Query on YouTube Video.
"Man in suit"
"Cute boy"
"Search Query - Crop!"
"Whats the time"
"Hoodie guy"
"Mini Cooper"
"Whiskey Bottle"
How?
- This is done by combining YOLOv5 object detection with OpenAI's CLIP model.
- Detect and crop objects (yolov5s)
- Encode cropped images using CLIP
- Encode search query using CLIP
- Find the best match
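The last two steps can be sketched as follows. This is a minimal illustration, not the project's actual code: it assumes the crops and the search query have already been encoded into embedding vectors (by CLIP in the real pipeline), and the names `cosine_similarity` and `best_crop` are hypothetical helpers introduced for this example. Plain Python lists stand in for the embeddings so the sketch runs without any model downloads.

```python
import math
from typing import List, Sequence, Tuple

Box = Tuple[int, int, int, int]  # (x1, y1, x2, y2) in pixels, as a detector might return
Vector = Sequence[float]         # stand-in for an image or text embedding


def cosine_similarity(a: Vector, b: Vector) -> float:
    """Cosine similarity between two embedding vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def best_crop(
    boxes: List[Box],
    crop_embeddings: List[Vector],
    query_embedding: Vector,
) -> Tuple[Box, float]:
    """Return the detected box whose crop embedding best matches the query."""
    if not boxes or len(boxes) != len(crop_embeddings):
        raise ValueError("need one embedding per detected box, and at least one box")
    scores = [cosine_similarity(e, query_embedding) for e in crop_embeddings]
    best = max(range(len(scores)), key=scores.__getitem__)
    return boxes[best], scores[best]


# Toy data: three detections with made-up 2-D embeddings (assumed values).
boxes = [(0, 0, 10, 10), (5, 5, 20, 20), (30, 30, 40, 40)]
crop_embeddings = [[1.0, 0.0], [0.6, 0.8], [0.0, 1.0]]
query_embedding = [0.0, 2.0]

box, score = best_crop(boxes, crop_embeddings, query_embedding)
print(box, round(score, 3))
```

Cosine similarity is used here because it ignores vector length and compares direction only, which is a common way to compare image and text embeddings; the project itself may score matches differently.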
Why?
- #vacation
☺️
It can also be used to create datasets with some changes to the code. In the example below, images of a Jack Daniel's bottle have been cropped and saved.
Search Query on batch - "Jack Daniels"
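One way the batch case could work is to keep every crop whose similarity to the query clears a threshold, instead of only the single best match. The sketch below only plans the output file paths. The function name `plan_dataset_paths`, the `0.25` threshold, the directory layout, and the `(source image, crop index, score)` input format are all assumptions made for illustration, not taken from the project.

```python
from pathlib import Path
from typing import List, Tuple

ScoredCrop = Tuple[str, int, float]  # (source image name, crop index, similarity score)


def plan_dataset_paths(
    scored_crops: List[ScoredCrop],
    threshold: float = 0.25,      # assumed cut-off; tune it for your own data
    out_dir: str = "dataset",     # hypothetical output directory
    label: str = "jack_daniels",  # hypothetical class label derived from the query
) -> List[Path]:
    """Return a save path for every crop whose score is at least `threshold`."""
    kept = []
    for source, index, score in scored_crops:
        if score >= threshold:
            stem = Path(source).stem  # file name without its extension
            kept.append(Path(out_dir) / label / f"{stem}_{index}.jpg")
    return kept


# Made-up scores for crops taken from two images.
scored = [("a.jpg", 0, 0.31), ("a.jpg", 1, 0.12), ("b.png", 0, 0.27)]
for path in plan_dataset_paths(scored):
    print(path.as_posix())
```

Each returned path could then be passed to an image library's save call for the corresponding crop.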
Simple App
Hugging Face Spaces
Limitations
- Depends heavily on object detection (YOLOv5).
- YOLOv5 is a family of object detection architectures and models pretrained on the COCO dataset, so detection depends on the COCO classes.