penteract
The native Node.js bindings to the Tesseract OCR project.
- Uses native Node.js bindings, so no tesseract command-line process is spawned.
- Asynchronous I/O: image reading and processing run in an isolated event loop backed by libuv.
- Supports reading image data from JavaScript Buffers.
Contributions are welcome.
Install
First of all, a g++ 4.9 compiler is required.
Before installing penteract, install the following dependencies:
$ brew install pkg-config tesseract # mac os
Then install penteract with npm:
$ npm install penteract
To Use with Electron
Due to the limitations of Node native modules, if you want to use penteract with Electron, add a .npmrc file to the root of your Electron project before running npm install:
runtime = electron
; The version of the local electron,
; use `npm ls electron` to figure it out
target = 1.7.5
target_arch = x64
disturl = https://atom.io/download/atom-shell
Usage
Recognize an Image Buffer
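import { recognize } from 'penteract'
import fs from 'fs-extra'
import path from 'path'
// Path to the image to recognize
const filepath = path.join(__dirname, 'test', 'fixtures', 'penteract.jpg')
// fs-extra's readFile returns a Promise resolving to a Buffer
fs.readFile(filepath).then(recognize).then(console.log) // 'penteract'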
Recognize a Local Image File
import { fromFile } from 'penteract'
fromFile(filepath, {lang: 'eng'}).then(console.log) // 'penteract'
recognize(image [, options])
- image Buffer the content buffer of the image file.
- options PenteractOptions= optional
Returns Promise.<String> the recognized text if succeeded.
fromFile(filepath [, options])
- filepath Path the file path of the image file.
- options PenteractOptions=
Returns Promise.<String>
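Since fromFile takes a path plus optional options and returns a Promise, several files can be processed with a small loop. The sketch below is only an illustration: recognizeFiles is a hypothetical helper, not part of penteract, and the OCR function is passed in as a parameter so that either fromFile or a stand-in can be used. Files are handled one after another here, on the assumption that running OCR jobs sequentially is acceptable for your use case.

```javascript
// Hypothetical helper (not part of penteract): run an OCR function over
// several files, one after another, and collect the text keyed by file path.
// `ocr` is expected to behave like fromFile(filepath, options) and return a
// Promise that resolves to the recognized text.
async function recognizeFiles (ocr, filepaths, options) {
  const results = {}
  for (const filepath of filepaths) {
    // Wait for each file before starting the next one
    results[filepath] = await ocr(filepath, options)
  }
  return results
}

// Possible usage with penteract (assumes penteract is installed):
// const { fromFile } = require('penteract')
// recognizeFiles(fromFile, ['a.png', 'b.png'], { lang: 'eng' }).then(console.log)
```

If any file is rejected, the returned Promise rejects with that error and the remaining files are not processed.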
PenteractOptions
Object
{
// @type `(String|Array.<String>)=eng`,
//
// Specifies language(s) used for OCR.
// Run `tesseract --list-langs` in command line for all supported languages.
// Defaults to `'eng'`.
//
// To specify multiple languages, use an array.
// English and Simplified Chinese, for example:
// ```
// lang: ['eng', 'chi_sim']
// ```
lang: 'eng'
}
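When moving from the tesseract command line, where multiple languages are joined with a plus sign (for example -l eng+chi_sim), it can help to convert that string into the lang option shown above. The following is a minimal sketch; toLangOption is a hypothetical helper introduced here for illustration and is not provided by penteract.

```javascript
// Hypothetical helper (not part of penteract): converts a tesseract CLI-style
// language string such as 'eng+chi_sim' into a PenteractOptions-shaped object.
function toLangOption (cliLangs) {
  const langs = String(cliLangs)
    .split('+')
    .map((s) => s.trim())
    .filter(Boolean)
  // Fall back to the documented default when nothing usable was given
  if (langs.length === 0) return { lang: 'eng' }
  // A single language stays a string, several languages become an array
  return { lang: langs.length === 1 ? langs[0] : langs }
}

// Possible usage with penteract (assumes penteract is installed):
// const { fromFile } = require('penteract')
// fromFile(filepath, toLangOption('eng+chi_sim')).then(console.log)
```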
Promise.reject(error)
- error Error The JavaScript Error instance
  - code String Error code.
  - message String Error message.
  - other properties of Error.
- code
  - ERR_READ_IMAGE: Rejects if it fails to read image data from file or buffer.
  - ERR_INIT_TESSER: Rejects if tesseract fails to initialize.
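The error codes above can be used to tell failures apart in a catch handler. The sketch below is only an illustration: describeOcrError is a hypothetical helper, not part of penteract, and the message strings are made up for this example. Only the two code values come from the list above.

```javascript
// Hypothetical helper (not part of penteract): maps the documented error
// codes to human-readable messages. The message texts are illustrative.
function describeOcrError (err) {
  switch (err && err.code) {
    case 'ERR_READ_IMAGE':
      return 'Could not read image data from the file or buffer.'
    case 'ERR_INIT_TESSER':
      return 'Tesseract failed to initialize.'
    default:
      // Any other error: fall back to its own message
      return 'Unexpected error: ' + (err && err.message)
  }
}

// Possible usage with penteract (assumes penteract is installed):
// const { fromFile } = require('penteract')
// fromFile(filepath).then(console.log).catch((err) => {
//   console.error(describeOcrError(err))
// })
```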
Example of Using with Electron
// For details of `mainWindow: BrowserWindow`, see
// https://github.com/electron/electron/blob/master/docs/api/browser-window.md
mainWindow.capturePage({
x: 10,
y: 10,
width: 100,
height: 10
}, (data) => {
recognize(data.toPNG()).then(console.log)
})
Compiling Troubles
For macOS users: if you have trouble compiling, running the following command resolves most problems:
$ xcode-select --install
If you see the following error:
xcode-select: error: tool 'xcodebuild' requires Xcode, but active developer directory '/Library/Developer/CommandLineTools' is a command line tools instance
point xcode-select at the full Xcode installation to resolve it:
$ sudo xcode-select -s /Applications/Xcode.app/Contents/Developer
License
MIT