• Stars
    star
    119
  • Rank 296,148 (Top 6 %)
  • Language
    HTML
  • License
    MIT License
  • Created 7 months ago
  • Updated about 2 months ago

Reviews

There are no reviews yet. Be the first to send feedback to the community and the maintainers!

Repository Details

Turn an image of a bookshelf into an interactive SVG.

Make your bookshelf clickable

Use computer vision to generate an SVG that you can overlay onto a photo of your bookshelf that lets you click on each book to find out more information.

Demo

Try the demo

demo.mov

How it Works

This tool uses computer vision to identify and segment each book spine in an image of a bookshelf. Then, each book spine is sent to GPT-4 with Vision to read the book title and, if possible, the author.

This information is then sent to the Google Books API. The book ISBN, author name, and other meta information is retrieved from this API.

An SVG is then created using the segmented book spines. Each book is assigned a polygon which, when clicked, takes you to the Google Books page associated with a book.

This script uses the following vision tools:

  • Grounding DINO (zero-shot object detection model)
  • Segment Anything (image segmentation model)
  • GPT-4 with Vision API
  • OpenCV Python

It takes around 20 seconds to generate the polygons that map to the location of each book on an M1 Macbook Air. It then takes a few seconds to process each book with the OpenAI GPT-4 with Vision API.

For a bookshelf with 11 books, the script takes around one minute to run.

The script returns a HTML file with an SVG file that is overlaid onto the source image.

How to Use

First, clone this project and install the required dependencies:

git clone https://github.com/capjamesg/cv-book-svg
cd cv-book-svg
pip3 install -r requirements.txt

Then, run the main script:

python3 grounded.py --image=example.jpg --output=annotation.html

This script takes an image as input (PNG, JPEG) and outputs a HTML document.

Limitations

This system may:

  • Not identify all books on a bookshelf (thin books are more likely to not be identified).
  • Generate a link to the wrong Google Books URL (which will happen if a book is not available on Google Books, or if a book has a generic title like "Poems of Emily Dickinson", which could on its own refer to several publications).
  • Mis-identify some books.

Notes

  • video.py contains a work-in-progress system for identifying all unique books in a video.

License

This project is licensed under an MIT license.

Contributing

Found a bug? Have an idea that you'd like to see in the project? Open an Issue in this GitHub repository.

More Repositories

1

visionscript

A high-level programming language for using computer vision.
Python
339
star
2

aurora

A fast, extensible static site generator implemented in Python. ✨
Python
163
star
3

knowledge-graph-language

A query language for exploring knowledge graphs.
Python
135
star
4

sam-gpt4v

Use Grounding DINO, Segment Anything, and GPT-4V to label images with segmentation masks for use in training smaller, fine-tuned models.
Python
64
star
5

bsky.link

Generate embeddable link previews to posts on Bluesky.
Nunjucks
27
star
6

sam-clip

Use Grounding DINO, Segment Anything, and CLIP to label objects in images.
Python
22
star
7

vinyl-record-indexing

A system for indexing vinyl records.
Python
22
star
8

indieweb-search

Source code for the IndieWeb search engine.
Python
22
star
9

indieweb-utils

Utilities to aid the implementation of various IndieWeb specifications and functionalities. Built with Python.
Python
20
star
10

open-shelves

An open source computer vision project to identify book spines.
Python
14
star
11

spreadsheet

A spreadsheet engine implemented in Python.
Python
12
star
12

nanosearch

Build a search engine from a website sitemap.
Python
12
star
13

webmemex.js

Display cards for all of the outgoing links on a web page.
JavaScript
11
star
14

cinnamon

A social reader built with Python Flask.
Python
10
star
15

webmention-receiver

A webmention receiver written in Python Flask with sqlite3.
Python
8
star
16

aurora-blog-template

A blog template made with the Aurora static site generator.
HTML
7
star
17

jamesql

An in-memory NoSQL database implemented in Python.
JavaScript
7
star
18

SEOtools

A set of utilities for SEOs and web developers with which to complete common tasks.
Python
6
star
19

build-a-search-index

Code to accompany the "Build a search index in Python" tutorial.
Python
6
star
20

pysurprisal

Calculate surprisal for words in text.
Python
5
star
21

google-indexing-api

Use the Google Indexing API to submit URLs for indexing to Google Search.
Python
5
star
22

papers-with-code-rss

Papers with Code RSS feeds.
Python
5
star
23

awesome-clip-projects

A list of projects that use OpenAI's CLIP model.
4
star
24

airport-pianos

Helping place more pianos in airports.
Nunjucks
4
star
25

avtr.dev

Retrieve an avatar for a URL.
Perl
4
star
26

interactive-image-svg

An experiment with an interactive image in a HTML document with SVG overlays.
HTML
4
star
27

jamesg-indieauth

An IndieAuth endpoint built with Python Flask.
Python
4
star
28

zero-shot-crack-detection

Zero-shot crack detection with SAM and Grounding DINO.
Python
4
star
29

indieweb-etherpad-archiver

Perl tool for archiving Etherpad links to the IndieWeb wiki.
Perl
3
star
30

awsnap.js

Navigate websites by clicking your fingers and saying the link you want to visit.
HTML
3
star
31

web-calendar

A web component for rendering static calendars.
JavaScript
3
star
32

llm-chatbot

A chatbot that references documents in a limited corpus to answer questions.
Python
3
star
33

adventures-with-compression

My adventures with compression.
Python
3
star
34

indieweb-search-links

Link analysis for IndieWeb Search
Python
3
star
35

micropub

A Micropub client and server implemented in Python Flask.
Python
3
star
36

web-reader

A minimal web reader.
Python
3
star
37

hugging-face-papers-rss

An RSS feed for Hugging Face Papers.
Python
3
star
38

autowrite-v2

A personal predictive text engine with a web client.
HTML
3
star
39

taytaylyricofthe.day

Challenge: Fill in the missing word of a Taylor Swift lyric every day.
HTML
3
star
40

autowrite

Context-aware autocomplete and autocorrect powered by word surprisals.
HTML
2
star
41

hovercard.js

A script to load cards when you hover over a link in an article.
JavaScript
2
star
42

drag-and-drop-list

A web component that lets you drag and drop items in a list to reorder items.
JavaScript
2
star
43

computer-vision-challenges

Test your knowledge of computer vision with these challenges.
2
star
44

html-timelines

Make a timeline with HTML.
HTML
2
star
45

linguist.link

Find the most surprising words and most common n-grams on a web page.
Python
2
star
46

salmention

Salmention playground.
HTML
2
star
47

openai-blog-rss

An RSS feed for the OpenAI blog.
Python
2
star
48

microsub-opml-utils

Import OPML files into a Microsub server and export Microsub subscriptions to an OPML file.
Ruby
2
star
49

guessthechar

Guess the missing character.
Python
2
star
50

seasonal.js

Change an emoji on your website for different seasonal events.
JavaScript
2
star
51

calendar-date

A web component to show calendar dates on your website.
JavaScript
2
star
52

spontaneity-rss

RSS Feed for @telepathics' Spontaneity Generator
JavaScript
2
star
53

website-trading-cards

Generate a trading card for your personal website.
HTML
2
star
54

image-map-maker

A tool to make image maps.
HTML
2
star
55

disinfo-domains

Check if a domain has been flagged as associated with disinformation on Wikipedia.
Python
2
star
56

random-boolean-networks

Experimentation with random boolean networks.
Python
2
star
57

pyatproto

A wrapper for interacting with the AT Protocol API.
Python
2
star
58

screenshots

A service to serve meta images for my personal website.
HTML
2
star
59

stories.js

A HTML component that enables stories on your personal website.
HTML
1
star
60

awesome-grounding-dino

A curated list of tools using and applications of Grounding DINO.
1
star
61

visionscript-examples

Unofficial VisionScript examples with notes used for brainstorming language design.
1
star
62

visionscript-vscode-highlight

A TextMate Grammar (tmGrammar) for use with VisionScript.
JavaScript
1
star
63

wiki

A personal wiki for documenting projects. Authentication powered by IndieAuth.
Python
1
star
64

fine-tune-clip

Fine-tune CLIP.
Python
1
star
65

jamesg.blog.deb

An exploration of the .deb package format.
1
star
66

recommend-firefox

A web component for recommending the Firefox web browser.
JavaScript
1
star
67

hacker-news-poetry

Poetry generated programmatically from the front page of Hacker News. Reset hourly.
HTML
1
star
68

pandoc

The Pandoc script I used to generate the "Software Technical Writing: A Guidebook" e-book.
Shell
1
star
69

bpe

Byte-pair encoding implementation in Python.
Python
1
star
70

cash-counter-mobile

HTML
1
star
71

python-indieauth-helpers

IndieAuth authorization and callback helpers written in Python.
Python
1
star
72

letsjam

A static site generator built with Python and jinja2.
Python
1
star
73

highlight.js

Inline text highlights for web pages.
JavaScript
1
star
74

python-packaging-best-practices

Best practices for packaging and documenting Python packages ✨
1
star
75

taylor-swift-acronyms

Analysis of Taylor Swift song name acronyms.
Python
1
star
76

title-prettier

A Python tool to normalize titles into a consistent format.
Python
1
star
77

wikitrends

Visualize Wikipedia traffic for specific pages.
HTML
1
star
78

lispDOT

A DOT DSL implemented in Common Lisp.
Common Lisp
1
star
79

trackback-server

A server to receive Trackbacks.
HTML
1
star
80

coffee-decision-tree

A decision tree showing how I decide what coffee to drink in a given moment.
Mermaid
1
star
81

fragmention.js

An implementation of the Fragmentation specification in JavaScript.
JavaScript
1
star
82

image-collage

Generate an image collage with computer vision.
Python
1
star
83

naive-bayes

A NaΓ―ve Bayes classifer implemented in Python.
Python
1
star
84

train-station-pianos

A directory of train station pianos.
HTML
1
star
85

rhyme-dictionary

A minimal rhyme dictionary.
HTML
1
star
86

vibes

Documenting a vibe in HTML.
HTML
1
star
87

venue-page-experiments

Exploration into simple, information-focused venue page design (i.e. cafes, restaurants).
HTML
1
star
88

hn-webmention

Send Webmentions from Hacker News to your personal website.
Python
1
star