• This repository has been archived on 04/Jul/2018
  • Stars
    star
    583
  • Rank 76,663 (Top 2 %)
  • Language
    C++
  • License
    MIT License
  • Created about 12 years ago
  • Updated about 10 years ago

Reviews

There are no reviews yet. Be the first to send feedback to the community and the maintainers!

Repository Details

Tesseract OCR for iOS

Tesseract for iOS

tesseract-ios is not actively maintained anymore. I encourage you to use gali8's Tesseract-OCR-iOS instead.

About

Tesseract-ios is an Objective-C wrapper for Tesseract OCR.

This project couldn't exist without the ร‚ngelo Suzuki's blog post. A lot of code came from his article.

Requirements

  • iOS SDK 6.0, iOS 5.0+ (there is no support for armv6)
  • Tesseract and Leptonica libraries from the tesseract-ios-lib repo.

Installation

  • Add tesseract-ios as a group, and tessdata by reference to your project:

  • Go to your project settings, and ensure that C++ Standard Library => libstdc++:

Usage

Here is the default workflow to extract text from an image:

  • Instantiate Tesseract with data path and language
  • Set variables (character set, โ€ฆ)
  • Set the image to analyze
  • Start recognition
  • Get recognized text
  • Clear

Code Sample

#import "Tesseract.h"

Tesseract* tesseract = [[Tesseract alloc] initWithDataPath:@"tessdata" language:@"eng"];
[tesseract setVariableValue:@"0123456789" forKey:@"tessedit_char_whitelist"];
[tesseract setImage:[UIImage imageNamed:@"image_sample.jpg"]];
[tesseract recognize];

NSLog(@"%@", [tesseract recognizedText]);
[tesseract clear];

Method reference

-initWithDataPath:language:

- (id)initWithDataPath:(NSString *)dataPath language:(NSString *)language

Initialize a new Tesseract instance.

  • dataPath: a relative path from the application bundle to the .traineddata files. You can find these files from the tesseract downloads section.
  • language: language used for recognition. Ex: eng. Tesseract will search for a eng.traineddata file in the dataPath directory.

Returns nil if instanciation failed.

-setVariableValue:forKey:

- (void)setVariableValue:(NSString *)value forKey:(NSString *)key

Set Tesseract variable key to value. See http://www.sk-spell.sk.cx/tesseract-ocr-en-variables for a complete (but not up-to-date) list.

For instance, use tessedit_char_whitelist to restrict characters to a specific set.

-setImage:

- (void)setImage:(UIImage *)image

Set the image to recognize.

-setLanguage:

- (BOOL)setLanguage:(NSString *)language

Override the language defined with -initWithDataPath:language:.

-recognize

- (BOOL)recognize

Start text recognition. You might want to launch this process in background with NSObject's -performSelectorInBackground:withObject:.

-recognizedText

- (NSString *)recognizedText

Get the text extracted from the image.

-clear

- (void) clear

Clears Tesseract object after text has been recognized from image. Preventing memory leaks.