  • Stars: 220
  • Rank: 175,051 (Top 4 %)
  • Language: Python
  • License: Other
  • Created about 4 years ago
  • Updated about 1 year ago

Repository Details

A pre-trained YOLO based hand detection network.

YOLO-Hand-Detection

Scene hand detection for real world images.

Hand Detection Example

Idea

To detect hand gestures, we first have to detect the position of the hand in space. This pre-trained network extracts hands from a 2D RGB image using the YOLOv3 neural network.

There are already existing models available, mainly for MobileNetSSD networks. The goal of this model is to support a wider range of images and to provide a more stable detector (hopefully 🙈).

Dataset

The first version of this network was trained on the CMU Hand DB dataset, which is free to access and download. Because the results were OK but not satisfying, I used that network to pre-annotate more images and then corrected the pre-annotations manually.

Because Handtracking by Victor Dibia uses the Egohands dataset, I tried to include it in the training set as well.

In the end, the training set consists of the CMU Hand DB, the Egohands dataset, and my own annotated images (mainly of marathon runners), called cross-hands.

Training

Training took about 10 hours on a single NVIDIA GTX 1080 Ti and used the default YOLOv3 architecture. I also trained its slim variant, YOLOv3-Tiny.

YOLOv3

Training Graph

Precision: 0.89 Recall: 0.85 F1-Score: 0.87 IoU: 69.8
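
The F1-Score is the harmonic mean of precision and recall. As a minimal sketch (the function name `f1_score` is only illustrative and not part of this repository), the value reported for YOLOv3 can be recomputed from its precision and recall. Because the published precision and recall are themselves rounded, a recomputed value can differ from a reported one in the last digit.

```python
def f1_score(precision, recall):
    """Return the harmonic mean of precision and recall."""
    if precision + recall == 0:
        # Avoid a division by zero when both inputs are zero.
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Precision and recall as reported for YOLOv3 above.
print(round(f1_score(0.89, 0.85), 2))  # 0.87
```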

YOLOv3-Tiny

Training Graph

Precision: 0.76 Recall: 0.69 F1-Score: 0.72 IoU: 53.67
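
The IoU (intersection over union) measures how well a predicted box overlaps a ground-truth box; the values listed for each model appear to be percentages averaged over detections, which is an assumption. The sketch below shows the computation for a single pair of boxes. The `(x, y, width, height)` box layout and the function name `iou` are illustrative choices, not taken from this repository.

```python
def iou(box_a, box_b):
    """Intersection over union of two boxes given as (x, y, width, height)."""
    ax, ay, aw, ah = box_a
    bx, by, bw, bh = box_b

    # Width and height of the overlapping region (zero if the boxes are disjoint).
    inter_w = max(0, min(ax + aw, bx + bw) - max(ax, bx))
    inter_h = max(0, min(ay + ah, by + bh) - max(ay, by))
    intersection = inter_w * inter_h

    union = aw * ah + bw * bh - intersection
    return intersection / union if union > 0 else 0.0


# Two 10x10 boxes shifted by 5 pixels share half of their area each.
print(round(iou((0, 0, 10, 10), (5, 0, 10, 10)), 3))  # 0.333
```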

YOLOv3-Tiny-PRN

The tiny version of YOLO has been improved in the partial residual networks (PRN) paper. Because of that, I also trained YOLOv3-Tiny-PRN and share the results here. It is interesting to see that the performance of YOLOv3-Tiny-PRN comes close to that of the original YOLOv3!

Training Graph

Precision: 0.89 Recall: 0.79 F1-Score: 0.83 IoU: 68.47

YOLOv4-Tiny

With the recent release of YOLOv4, it was interesting to see how well it performs against its predecessor: the same precision, but better recall and IoU.

Training Graph

Precision: 0.89 Recall: 0.89 F1-Score: 0.89 IoU: 91.48

Testing

I could not test the model on the test split of an existing dataset such as Egohands, because I mixed the training and testing samples together and created my own test dataset out of them.

As soon as I have time, I will publish a comparison of my trained models against other detectors, for example Handtracking.

Inferencing

The models have been trained with an image size of 416x416. It is also possible to run inference with a smaller input size to increase speed. On CPUs, an image size of 256x256 has proven to be a good trade-off between speed and accuracy.
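
To get a rough feeling for why a smaller input size is faster, the sketch below compares the number of input pixels at a given size with the 416x416 training size. It assumes that the cost of a convolutional detector scales roughly with the pixel count, which is a simplification. The multiple-of-32 check reflects the network stride of 32 in YOLOv3; the function name is illustrative.

```python
def relative_pixel_count(size, trained_size=416):
    """Ratio of input pixels at `size` x `size` versus the training size."""
    if size % 32 != 0:
        # YOLOv3 downsamples by a factor of 32, so the input side should be a multiple of 32.
        raise ValueError("input size should be a multiple of 32")
    return (size * size) / (trained_size * trained_size)


# 256x256 feeds the network a bit more than a third of the pixels of 416x416.
print(f"{relative_pixel_count(256):.2f}")  # 0.38
```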

The model itself is fully compatible with the OpenCV dnn module and ready to use.
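
As a minimal sketch of what loading the network with the OpenCV dnn module might look like (this is not the code of the demo script, and the helper names, thresholds, and the 1/255 scaling are assumptions), the Darknet config and weights can be wrapped in a `cv2.dnn_DetectionModel`:

```python
def load_detector(cfg_path, weights_path, size=416):
    """Load a Darknet YOLO model through the OpenCV dnn module."""
    # Imported here so that detect_hands() below can be used without OpenCV installed.
    import cv2

    net = cv2.dnn.readNetFromDarknet(cfg_path, weights_path)
    model = cv2.dnn_DetectionModel(net)
    # Assumption: pixel values scaled to [0, 1] and BGR swapped to RGB, as is common for YOLO.
    model.setInputParams(size=(size, size), scale=1.0 / 255, swapRB=True)
    return model


def detect_hands(model, image, conf_threshold=0.5, nms_threshold=0.4):
    """Return a list of ((x, y, width, height), confidence) tuples."""
    _class_ids, confidences, boxes = model.detect(
        image, confThreshold=conf_threshold, nmsThreshold=nms_threshold
    )
    results = []
    for box, conf in zip(boxes, confidences):
        # Depending on the OpenCV version, a confidence is a scalar or a 1-element array.
        score = float(conf[0]) if hasattr(conf, "__len__") else float(conf)
        results.append((tuple(int(v) for v in box), score))
    return results
```

With the files downloaded, usage might look like `model = load_detector("models/<name>.cfg", "models/<name>.weights", size=256)` followed by `detect_hands(model, cv2.imread("image.jpg"))`. The file names are placeholders; use the ones fetched by the download script.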

Demo

To run the demo, first install all the dependencies (requirements.txt) into a virtual environment, then download the model and weights into the models folder (or run the shell script).

# mac / linux
cd models && sh ./download-models.sh

# windows
cd models && powershell .\download-models.ps1

Then run the following command to start a webcam detector with YOLOv3:

# with python 3
python demo_webcam.py

Or this one to run a webcam detector with YOLOv3-Tiny:

# with python 3
python demo_webcam.py -n tiny

For YOLOv3-Tiny-PRN use the following command:

# with python 3
python demo_webcam.py -n prn

For YOLOv4-Tiny use the following command:

# with python 3
python demo_webcam.py -n v4-tiny

Download

If you are interested in the CMU Hand DB results, please check the release section.

About

Trained by cansik. The datasets are described in the readme and fall under the terms and conditions of their owners.

All the demo images have been downloaded from unsplash.com:

Tim Marshall, Zachary Nelson, John Torcasio, Andy Falconer, Sherise, Alexis Brown
