Towards Accurate Scene Text Recognition with Semantic Reasoning Networks
Unofficial PyTorch implementation of the paper. The Semantic Reasoning Network (SRN) integrates a global semantic reasoning module, a parallel visual attention module, and a visual-semantic fusion decoder, and can be trained end-to-end.
At present, this implementation does not reach the accuracy reported in the paper. Parts of the code are borrowed from deep-text-recognition-benchmark.
Results
IIIT5k_3000 | SVT | IC03_860 | IC03_867 | IC13_857 | IC13_1015 | IC15_1811 | IC15_2077 | SVTP | CUTE80 |
---|---|---|---|---|---|---|---|---|---|
84.600 | 83.617 | 92.907 | 92.849 | 90.315 | 88.177 | 71.010 | 68.064 | 71.008 | 68.641 |
total_accuracy: 80.597
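The total above is not the unweighted mean of the ten columns (that would be about 81.12), so it is presumably pooled over all evaluated samples rather than averaged per dataset. A minimal sketch of the two ways of aggregating, assuming per-dataset correct counts and sample counts are available; the function names and the numbers in the usage note are hypothetical and not taken from this repository:

```python
from typing import Sequence


def pooled_accuracy(correct: Sequence[int], totals: Sequence[int]) -> float:
    """Accuracy in percent over all samples pooled together (weighted by dataset size)."""
    if len(correct) != len(totals):
        raise ValueError("correct and totals must have the same length")
    n_samples = sum(totals)
    if n_samples == 0:
        raise ValueError("no samples to evaluate")
    return 100.0 * sum(correct) / n_samples


def mean_accuracy(correct: Sequence[int], totals: Sequence[int]) -> float:
    """Unweighted mean of the per-dataset accuracies, in percent."""
    per_dataset = [100.0 * c / t for c, t in zip(correct, totals)]
    return sum(per_dataset) / len(per_dataset)
```

With two made-up datasets, `correct=[90, 5]` and `totals=[100, 10]`, the pooled figure is 95/110 while the unweighted mean is the average of 90% and 50%, so large datasets dominate the pooled number.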
Features
- predicts all characters of a word at once (in parallel) instead of one character at a time
- DistributedDataParallel training
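To illustrate what predicting all characters at once means at decoding time, here is a minimal framework-free sketch. It assumes the network outputs one score vector per character position, that class index 0 is an end-of-sequence symbol, and that the remaining classes map onto a character set in order; `decode_parallel`, `eos_index` and `charset` are hypothetical names, not this repository's API.

```python
from typing import List, Sequence


def decode_parallel(scores: Sequence[Sequence[float]],
                    charset: str,
                    eos_index: int = 0) -> str:
    """Greedy decoding where every position is classified independently.

    scores[t][c] is the score of class c at character position t.
    Class eos_index marks the end of the word (an assumption of this sketch);
    class k (k >= 1) corresponds to charset[k - 1].
    """
    # One argmax per position. No position waits for an earlier prediction,
    # which is what distinguishes this from step-by-step attention decoding.
    best: List[int] = [
        max(range(len(row)), key=row.__getitem__) for row in scores
    ]

    chars: List[str] = []
    for idx in best:
        if idx == eos_index:
            break  # everything after the first end symbol is ignored
        chars.append(charset[idx - 1])
    return "".join(chars)
```

In a real model the per-position argmax would be a single tensor operation over the whole batch; the loop here only trims the result at the first end symbol.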
Requirements
PyTorch >= 1.1.0
Test
- download the evaluation data from deep-text-recognition-benchmark
- download the pretrained model from Baidu, Password: d2qn
- test on the evaluation data:
python test.py --eval_data path-to-data --saved_model path-to-model
Train
- download the training data from deep-text-recognition-benchmark
- train from scratch:
python train.py --train_data path-to-train-data --valid-data path-to-valid-data
Reference
- bert_ocr.pytorch
- deep-text-recognition-benchmark
- 2D Attentional Irregular Scene Text Recognizer
- Towards Accurate Scene Text Recognition with Semantic Reasoning Networks
Differences from the original paper
- uses a ResNet backbone that produces 1D features instead of the ResNet-FPN 2D features
- uses element-wise addition instead of a gated unit in the visual-semantic fusion decoder
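The second difference can be written out as follows. This is my reading of the fusion step in the paper, so treat the notation as an assumption: $g_t$ is the aligned visual feature at position $t$, $s_t$ the semantic feature, $W_z$ a trainable weight, $\sigma$ the sigmoid, and $f_t$ the fused feature passed to the classifier.

```latex
% Gated unit, as described in the paper
z_t = \sigma\left(W_z \cdot [g_t, s_t]\right), \qquad
f_t = z_t \odot g_t + (1 - z_t) \odot s_t

% Plain addition, as used in this implementation
f_t = g_t + s_t
```

The additive form has no learned gate, so the visual and semantic features always contribute with equal weight, which may be one reason for the accuracy gap.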
Other
Matching the accuracy reported in the paper has proven difficult. I hope more people will try and share their results.