Recognizing cropped text in natural images.

ASTER: Attentional Scene Text Recognizer with Flexible Rectification

ASTER is an accurate scene text recognizer with a flexible rectification mechanism. The research paper can be found here.

ASTER Overview

The implementation of ASTER reuses code from the TensorFlow Object Detection API.

Update

[07/13/2019] A PyTorch port has been made by @ayumiymk.

Correction (10/22/2018)

We identified a bug in our code that caused only part of the SVT images to be tested, which inflated the reported results. The bug has been fixed in commit a7e8613. Below are the corrected numbers on SVT. The results are still state-of-the-art, so the conclusions are not affected.

  • SVT (50): ASTER: 97.4%; ASTER-A: 96.3%; ASTER-B: 96.1%
  • SVT (None): ASTER: 89.5%; ASTER-A: 80.2%; ASTER-B: 81.6%
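
By the usual convention in scene text recognition benchmarks, "SVT (50)" refers to evaluation with a 50-word lexicon per image and "SVT (None)" to lexicon-free evaluation. The sketch below illustrates how word accuracy is commonly computed under both settings. It is not the repository's evaluation code: the function names `edit_distance` and `word_accuracy` are hypothetical, and case-insensitive matching is an assumption.

```python
def edit_distance(a, b):
    """Levenshtein distance between two strings (dynamic programming)."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # delete ca
                            curr[j - 1] + 1,      # insert cb
                            prev[j - 1] + cost))  # substitute or match
        prev = curr
    return prev[-1]


def word_accuracy(predictions, groundtruths, lexicons=None):
    """Fraction of words recognized exactly (case-insensitive, an assumption).

    If `lexicons` is given, each prediction is first replaced by the
    closest lexicon word of that image, measured by edit distance.
    """
    if not groundtruths:
        raise ValueError("groundtruths must not be empty")
    correct = 0
    for idx, (pred, gt) in enumerate(zip(predictions, groundtruths)):
        raw = pred.lower()
        word = raw
        if lexicons is not None:
            word = min(lexicons[idx],
                       key=lambda w: edit_distance(raw, w.lower())).lower()
        if word == gt.lower():
            correct += 1
    return correct / len(groundtruths)
```

With a lexicon, a prediction that is off by one character can still count as correct, which is why lexicon-constrained accuracy is typically higher than lexicon-free accuracy.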

Prerequisites

ASTER was developed and tested with TensorFlow r1.4. Higher versions may not work.
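
Because the tested version is stated explicitly, it can be convenient to check the installed TensorFlow before building anything. The following is a minimal sketch of such a check; the helper names are hypothetical and not part of ASTER, and the only library attribute it relies on is `tf.__version__`.

```python
import re


def parse_major_minor(version):
    """Extract (major, minor) from a version string such as '1.4.0' or '1.4.0-rc1'."""
    match = re.match(r"(\d+)\.(\d+)", version)
    if match is None:
        raise ValueError("unrecognized version string: %r" % version)
    return int(match.group(1)), int(match.group(2))


def is_tested_version(version, tested=(1, 4)):
    """True if `version` has the major.minor that ASTER was tested with (r1.4)."""
    return parse_major_minor(version) == tested


def check_installed_tensorflow():
    """Print a warning if the installed TensorFlow is not r1.4."""
    try:
        import tensorflow as tf
    except ImportError:
        print("TensorFlow is not installed.")
        return False
    if not is_tested_version(tf.__version__):
        print("Warning: found TensorFlow %s; ASTER was tested with r1.4."
              % tf.__version__)
        return False
    return True
```

Calling `check_installed_tensorflow()` before building the custom operators gives an early warning instead of a failure later in the build or at import time.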

ASTER requires Protocol Buffers (version >= 2.6). In addition, on Ubuntu 16.04, install the following packages:

sudo apt install cmake libcupti-dev
pip3 install --user protobuf tqdm numpy editdistance

Installation

  1. Go to c_ops/ and run build.sh to build the custom operators
  2. Execute protoc aster/protos/*.proto --python_out=. to build the protobuf files
  3. Add /path/to/aster to PYTHONPATH, or set this variable for every run
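
For step 3, setting the variable "for every run" can be done from a small launcher rather than by editing a shell profile. The sketch below builds an environment dictionary with the path prepended to `PYTHONPATH`; the helper name `env_with_pythonpath` is hypothetical, and `/path/to/aster` is the placeholder from the step above, not a real location.

```python
import os


def env_with_pythonpath(path, base_env=None):
    """Return a copy of the environment with `path` prepended to PYTHONPATH.

    `base_env` defaults to the current process environment. The original
    mapping is never modified.
    """
    env = dict(os.environ if base_env is None else base_env)
    entries = [p for p in env.get("PYTHONPATH", "").split(os.pathsep) if p]
    if path not in entries:
        entries.insert(0, path)  # put the ASTER path first
    env["PYTHONPATH"] = os.pathsep.join(entries)
    return env
```

The returned dictionary can be passed as the `env` argument of `subprocess.run` or `subprocess.Popen` when launching the scripts described in the following sections.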

Demo

A demo program is located at aster/demo.py, accompanied by pretrained model files available on our release page. Download model-demo.zip and extract it under aster/experiments/demo/ before running the demo.

To run the demo, simply execute:

python3 aster/demo.py

This will output the recognition result of the demo image and the rectified image.

Training and on-the-fly evaluation

Data preparation scripts for several popular scene text datasets are located under aster/tools. See their source code for usage.

To run the example training, execute:

python3 aster/train.py \
  --exp_dir experiments/demo \
  --num_clones 2

Change the configuration in experiments/aster/trainval.prototxt to configure your own training process.

During training, you can run a separate program that repeatedly evaluates the produced checkpoints:

python3 aster/eval.py \
   --exp_dir experiments/demo

Evaluation configuration is also in trainval.prototxt.
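
Since training and evaluation are meant to run side by side, a small launcher can start both processes on the same experiment directory. The sketch below only assembles the two command lines shown above; `build_command` and `launch_both` are hypothetical helpers, and the only flags used are `--exp_dir` and `--num_clones`, taken from the examples in this section. It uses the current interpreter (`sys.executable`) in place of `python3`, which is an assumption about the environment.

```python
import subprocess
import sys


def build_command(script, exp_dir, num_clones=None):
    """Assemble the argument list for aster/train.py or aster/eval.py."""
    cmd = [sys.executable, script, "--exp_dir", exp_dir]
    if num_clones is not None:
        cmd += ["--num_clones", str(num_clones)]
    return cmd


def launch_both(exp_dir, num_clones=2):
    """Start training and evaluation as two concurrent processes."""
    train = subprocess.Popen(build_command("aster/train.py", exp_dir, num_clones))
    evaluate = subprocess.Popen(build_command("aster/eval.py", exp_dir))
    return train, evaluate
```

For example, `launch_both("experiments/demo")` corresponds to running the two commands above in separate terminals. Both processes keep running until they finish or are terminated, for instance with `train.terminate()`.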

Citation

If you find this project helpful for your research, please cite the following papers:

@article{bshi2018aster,
  author  = {Baoguang Shi and
               Mingkun Yang and
               Xinggang Wang and
               Pengyuan Lyu and
               Cong Yao and
               Xiang Bai},
  title   = {ASTER: An Attentional Scene Text Recognizer with Flexible Rectification},
  journal = {IEEE Transactions on Pattern Analysis and Machine Intelligence}, 
  volume  = {}, 
  number  = {}, 
  pages   = {1-1},
  year    = {2018}, 
}

@inproceedings{ShiWLYB16,
  author    = {Baoguang Shi and
               Xinggang Wang and
               Pengyuan Lyu and
               Cong Yao and
               Xiang Bai},
  title     = {Robust Scene Text Recognition with Automatic Rectification},
  booktitle = {2016 {IEEE} Conference on Computer Vision and Pattern Recognition,
               {CVPR} 2016, Las Vegas, NV, USA, June 27-30, 2016},
  pages     = {4168--4176},
  year      = {2016}
}

IMPORTANT NOTICE: Although this software is licensed under MIT, our intention is to make it free for academic research purposes. If you are going to use it in a product, we suggest you contact us regarding possible patent issues.