  • Language
    JavaScript
  • License
    MIT License

Neural network OCR.

OCR

Trains a multi-layer perceptron (MLP) neural network to perform optical character recognition (OCR).

The training set is automatically generated using a heavily modified version of the captcha-generator node-captcha. Support for the MNIST handwritten digit database has been added recently (see performance section).

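The network takes a one-dimensional binary array as input (image_size * image_size elements: 16 * 16 = 256 with the default configuration, 400 for MNIST) and outputs an array of probabilities with one element per character (10 by default). The index of the largest probability identifies the predicted character. Initial performance measurements show promising success rates (see the Performance section).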

After training, the network is saved as a standalone module to ./ocr.js, which can then be used in your project like this (from test.js):

var predict = require('./ocr.js');

// a binary array that we want to predict
var one = [
  0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
  0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0,
  0, 0, 0, 0, 0, 0, 0, 1, 1, 0, 0, 0, 0, 0, 0, 0,
  0, 0, 0, 0, 0, 0, 0, 1, 1, 0, 0, 0, 0, 0, 0, 0,
  0, 0, 0, 0, 0, 0, 0, 1, 1, 0, 0, 0, 0, 0, 0, 0,
  0, 0, 0, 0, 0, 0, 0, 1, 1, 0, 0, 0, 0, 0, 0, 0,
  0, 0, 0, 0, 0, 0, 0, 1, 1, 0, 0, 0, 0, 0, 0, 0,
  0, 0, 0, 0, 0, 0, 0, 1, 1, 0, 0, 0, 0, 0, 0, 0,
  0, 0, 0, 0, 0, 0, 0, 1, 1, 0, 0, 0, 0, 0, 0, 0,
  0, 0, 0, 0, 0, 0, 0, 1, 1, 0, 0, 0, 0, 0, 0, 0,
  0, 0, 0, 0, 0, 0, 0, 1, 1, 0, 0, 0, 0, 0, 0, 0,
  0, 0, 0, 0, 0, 0, 0, 1, 1, 0, 0, 0, 0, 0, 0, 0,
  0, 0, 0, 0, 0, 0, 0, 1, 1, 0, 0, 0, 0, 0, 0, 0,
  0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0,
  0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0,
  0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
];

// the prediction is an array of probabilities
var prediction = predict(one);

// the index with the maximum probability is the best guess
console.log('prediction:', prediction.indexOf(Math.max.apply(null, prediction)));
// will hopefully output 1 if trained with 0-9 :)
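
For character sets other than the digits, the winning index still has to be mapped back to a character. The following is a minimal sketch, not code from this repository: decode is a hypothetical helper, and it assumes that output index i corresponds to the i-th character of the "text" string in config.json.

```javascript
// Hypothetical helper, not part of the repository.
// Assumption: output index i corresponds to the i-th character of the
// "text" string the network was trained with.
function decode(prediction, text) {
  var best = 0;
  for (var i = 1; i < prediction.length; i++) {
    if (prediction[i] > prediction[best]) {
      best = i;
    }
  }
  return {
    char: text.charAt(best),
    index: best,
    probability: prediction[best]
  };
}

// Made-up prediction vector for a network trained on the digits 0-9.
var example = [0.01, 0.93, 0.02, 0.00, 0.01, 0.00, 0.01, 0.01, 0.00, 0.01];
console.log(decode(example, '0123456789').char); // prints 1
```

Returning the probability alongside the character makes it easy to reject low-confidence guesses in calling code.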

Usage

Clone this repository. The script uses canvas, so you'll need to install the Cairo rendering engine. On OS X, assuming you have Homebrew installed, this can be done with the following command (copied from the canvas README):

$ brew install pkg-config cairo jpeg giflib

Then install npm dependencies and test it:

$ npm install
$ node main.js
$ node test.js

Performance

All runs below were performed on a 13" MacBook Pro Retina (Early 2015) with 8 GB RAM.

MNIST [0-9]

To test with the MNIST dataset, download the 4 data files from the MNIST database website (linked from the title above) and put them in a folder called mnist in the root directory of this repository.
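
The MNIST files use the IDX binary format: big-endian 32-bit header fields followed by raw bytes. To sanity-check a download before training, a sketch like the one below can read the header. readIdxHeader is a hypothetical helper, and the file name is an assumption about how the unpacked file is called; it is not necessarily how mnist.js loads the data.

```javascript
var fs = require('fs');

// Hypothetical helper, not part of the repository.
// Reads the header of an MNIST IDX file held in a Buffer.
// Magic number 2051 marks an image file, 2049 a label file.
function readIdxHeader(buffer) {
  var magic = buffer.readUInt32BE(0);
  if (magic === 2051) {
    return {
      type: 'images',
      count: buffer.readUInt32BE(4),
      rows: buffer.readUInt32BE(8),
      cols: buffer.readUInt32BE(12),
      offset: 16 // pixel bytes start here
    };
  }
  if (magic === 2049) {
    return {
      type: 'labels',
      count: buffer.readUInt32BE(4),
      offset: 8 // label bytes start here
    };
  }
  throw new Error('Not an MNIST IDX file, magic number: ' + magic);
}

// Assumed file name; adjust to match the unpacked download.
var imagesPath = './mnist/train-images-idx3-ubyte';
if (fs.existsSync(imagesPath)) {
  console.log(readIdxHeader(fs.readFileSync(imagesPath)));
}
```

The raw MNIST images are 28 x 28 pixels, while the network below uses 400 inputs, so the images are presumably reduced to 20 x 20 before training; that step is not shown here.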

// config.json
{
  "mnist": true,
  "network": {
    "hidden": 160,
    "learning_rate": 0.03
  }
}

Then run

$ node mnist.js
  • Neurons
    • 400 input
    • 160 hidden
    • 10 output
  • Learning rate: 0.03
  • Training set: 60000 digits
  • Testing set: 10000 digits
  • Training time: 21 min 53 s 753 ms
  • Success rate: 95.16%

[A-Za-z0-9]

// config.json
{
  "mnist": false,
  "text": "abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ0123456789",
  "fonts": [
    "sans-serif",
    "serif"
  ],
  "training_set": 2000,
  "testing_set": 1000,
  "image_size": 16,
  "threshold": 400,
  "network": {
    "hidden": 60,
    "learning_rate": 0.1,
    "output": 62
  }
}
  • Neurons
    • 256 input
    • 60 hidden
    • 62 output
  • Learning rate: 0.03
  • Training set
    • Size: 124000 characters
    • Sample: abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ0123456789
  • Testing set: 62000 characters
  • Training time: 8 min 18 s 560 ms
  • Success rate: 93.58225806451614%
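
The set sizes reported above follow from the configuration: 2000 training images times 62 glyphs gives 124000 characters, and 16 * 16 pixels gives 256 inputs. The sketch below reproduces that arithmetic. setSizes is a hypothetical helper, and it assumes that every generated image contains each glyph of "text" exactly once.

```javascript
// Hypothetical helper, not part of the repository.
// Assumption: each generated image contains every glyph in "text" once.
function setSizes(config) {
  var glyphs = config.text.length;
  return {
    training: config.training_set * glyphs,
    testing: config.testing_set * glyphs,
    inputs: config.image_size * config.image_size
  };
}

console.log(setSizes({
  text: 'abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ0123456789',
  training_set: 2000,
  testing_set: 1000,
  image_size: 16
}));
// { training: 124000, testing: 62000, inputs: 256 }
```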

[a-z]

// config.json
{
  "mnist": false,
  "text": "abcdefghijklmnopqrstuvwxyz",
  "fonts": [
    "sans-serif",
    "serif"
  ],
  "training_set": 2000,
  "testing_set": 1000,
  "image_size": 16,
  "threshold": 400,
  "network": {
    "hidden": 40,
    "learning_rate": 0.1,
    "output": 26
  }
}
  • Neurons
    • 256 input
    • 40 hidden
    • 26 output
  • Learning rate: 0.1
  • Training set
    • Size: 52000 characters
    • Sample: abcdefghijklmnopqrstuvwxyz
  • Testing set: 26000 characters
  • Training time: 1 min 55 s 414 ms
  • Success rate: 93.83846153846153%

[0-9]

// config.json
{
  "mnist": false,
  "text": "0123456789",
  "fonts": [
    "sans-serif",
    "serif"
  ],
  "training_set": 2000,
  "testing_set": 1000,
  "image_size": 16,
  "threshold": 400,
  "network": {
    "hidden": 40,
    "learning_rate": 0.1
  }
}
  • Neurons
    • 256 input
    • 40 hidden
    • 10 output
  • Learning rate: 0.1
  • Training set
    • Size: 20000 digits
    • Sample: 0123456789
  • Testing set: 10000 digits
  • Training time: 0 min 44 s 363 ms
  • Success rate: 99.59%
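
The success rates above are consistent with a plain ratio of correct guesses to test samples; 93.58225806451614% of 62000, for example, is exactly 58021 characters. The following is a minimal sketch of such an evaluation, assuming the best guess is the index with the highest probability, as in test.js. successRate and the stand-in predictor are hypothetical and not taken from the repository.

```javascript
// Hypothetical evaluation helper, not part of the repository.
// predict: function mapping a binary input array to an array of probabilities
// samples: array of { input: [...], label: <expected output index> }
function successRate(predict, samples) {
  var correct = 0;
  samples.forEach(function (sample) {
    var output = predict(sample.input);
    var guess = output.indexOf(Math.max.apply(null, output));
    if (guess === sample.label) {
      correct++;
    }
  });
  return 100 * correct / samples.length;
}

// Stand-in predictor that always favours index 1, for illustration only.
function alwaysOne() {
  return [0.1, 0.8, 0.1];
}

console.log(successRate(alwaysOne, [
  { input: [], label: 1 },
  { input: [], label: 2 }
]) + '%'); // one of two guesses is correct: 50%
```

With a trained network, the generated ./ocr.js module could be passed as predict.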

Configuration

Tweak the network for your needs by editing the config.json file located in the main folder. Pasted below is the default config file.

// config.json
{
  "mnist": false,
  "text": "0123456789",
  "fonts": [
    "sans-serif",
    "serif"
  ],
  "training_set": 2000,
  "testing_set": 1000,
  "image_size": 16,
  "threshold": 400,
  "network": {
    "hidden": 40,
    "learning_rate": 0.1
  }
}
  • mnist
    • If set to true, the MNIST handwritten digit dataset is used for training and testing the network. This setting overrides the configured set sizes and ignores the image_size, threshold, fonts and text settings.
  • text
    • A string containing the glyphs with which to train/test the network.
  • fonts
    • An array of fonts to be used when generating images.
  • training_set
    • Number of images to be generated and used as the network training set.
  • testing_set
    • Same as above, but these images are used for testing the network.
  • image_size
    • The size of the square chunk (in pixels) containing a glyph. The resulting network input size is image_size^2.
  • threshold
    • When analyzing the pixels of a glyph, the algorithm reduces each pixel (r, g, b) to the sum r + g + b, which ranges from 0 to 765 for 8-bit channels. Every pixel whose sum is below threshold is marked as 1 in the binary array used as network input; all other pixels are marked as 0.
  • network
    • hidden
      • The size (number of neurons) of the hidden layer of the network.
    • learning_rate
      • The learning rate of the network.
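
The threshold rule described above can be sketched as follows. binarize is a hypothetical helper written for illustration, not the repository's implementation. It assumes a flat RGBA pixel array with four values per pixel, which is the layout of the data property returned by the canvas getImageData() call.

```javascript
// Hypothetical helper, not part of the repository.
// pixels: flat array of RGBA values (4 entries per pixel)
// threshold: pixels with r + g + b below this value become 1
function binarize(pixels, threshold) {
  var bits = [];
  for (var i = 0; i < pixels.length; i += 4) {
    var sum = pixels[i] + pixels[i + 1] + pixels[i + 2]; // alpha is ignored
    bits.push(sum < threshold ? 1 : 0);
  }
  return bits;
}

// One black pixel and one white pixel, with the default threshold of 400.
console.log(binarize([0, 0, 0, 255, 255, 255, 255, 255], 400)); // [ 1, 0 ]
```

With image_size set to 16, a glyph chunk holds 16 * 16 * 4 = 1024 RGBA values and yields the 256-element input array.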
