CDistNet: Perceiving Multi-Domain Character Distance for Robust Text Recognition
The official code of CDistNet.
Paper Link: arXiv (http://arxiv.org/abs/2111.11011)
What's New
- [2023-08]🌟 Our paper has been accepted by IJCV.
- [2022-01]🌟 Our code has been released on GitHub.
- [2021-11]🌟 The paper is available on arXiv: http://arxiv.org/abs/2111.11011
To Do List
- HA-IC13 & CA-IC13
- Pre-train model
- Cleaned Code
- Document
- Distributed Training
Two New Datasets
We also test other SOTA methods on the HA-IC13 and CA-IC13 datasets.
CDistNet's advantage over other SOTA methods becomes more pronounced as the character distance increases (levels 1 to 6).
HA-IC13
Method | 1 | 2 | 3 | 4 | 5 | 6 | Code & Pretrained Model |
---|---|---|---|---|---|---|---|
VisionLAN (ICCV 2021) | 93.58 | 92.88 | 89.97 | 82.26 | 72.23 | 61.03 | Official Code |
ABINet (CVPR 2021) | 95.92 | 95.22 | 91.95 | 85.76 | 73.75 | 64.99 | Official Code |
RobustScanner* (ECCV 2020) | 96.15 | 95.33 | 93.23 | 88.91 | 81.10 | 71.53 | -- |
Transformer-baseline* | 96.27 | 95.45 | 92.42 | 86.46 | 79.35 | 72.46 | -- |
CDistNet | 96.62 | 96.15 | 94.28 | 89.96 | 83.43 | 77.71 | -- |
CA-IC13
Method | 1 | 2 | 3 | 4 | 5 | 6 | Code & Pretrained Model |
---|---|---|---|---|---|---|---|
VisionLAN (ICCV 2021) | 94.87 | 92.77 | 84.01 | 75.03 | 64.29 | 52.74 | Official Code |
ABINet (CVPR 2021) | 96.62 | 95.92 | 87.86 | 76.31 | 65.46 | 54.49 | Official Code |
RobustScanner* (ECCV 2020) | 95.22 | 94.87 | 85.30 | 76.55 | 68.38 | 60.79 | -- |
Transformer-baseline* | 95.68 | 94.40 | 85.88 | 75.85 | 65.93 | 58.58 | -- |
CDistNet | 96.27 | 95.57 | 88.45 | 79.58 | 70.36 | 63.13 | -- |
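To make the trend in the two tables concrete, the margin between CDistNet and the strongest competing method at each character-distance level can be computed directly from the reported numbers. This is only a small sketch: the accuracies are copied from the tables above, while the dictionary layout and the helper name `margin_over_best` are illustrative and not part of the CDistNet code base.

```python
# Accuracy (%) at character-distance levels 1-6, copied from the tables above.
HA_IC13 = {
    "VisionLAN": [93.58, 92.88, 89.97, 82.26, 72.23, 61.03],
    "ABINet": [95.92, 95.22, 91.95, 85.76, 73.75, 64.99],
    "RobustScanner*": [96.15, 95.33, 93.23, 88.91, 81.10, 71.53],
    "Transformer-baseline*": [96.27, 95.45, 92.42, 86.46, 79.35, 72.46],
    "CDistNet": [96.62, 96.15, 94.28, 89.96, 83.43, 77.71],
}
CA_IC13 = {
    "VisionLAN": [94.87, 92.77, 84.01, 75.03, 64.29, 52.74],
    "ABINet": [96.62, 95.92, 87.86, 76.31, 65.46, 54.49],
    "RobustScanner*": [95.22, 94.87, 85.30, 76.55, 68.38, 60.79],
    "Transformer-baseline*": [95.68, 94.40, 85.88, 75.85, 65.93, 58.58],
    "CDistNet": [96.27, 95.57, 88.45, 79.58, 70.36, 63.13],
}


def margin_over_best(table, method="CDistNet"):
    """Per-level gap (in points) between `method` and the best other method.

    Hypothetical helper for illustration only.
    """
    others = [scores for name, scores in table.items() if name != method]
    # zip(*others) groups the scores of all other methods level by level.
    best_other = [max(level_scores) for level_scores in zip(*others)]
    return [round(own - best, 2) for own, best in zip(table[method], best_other)]


if __name__ == "__main__":
    for name, table in (("HA-IC13", HA_IC13), ("CA-IC13", CA_IC13)):
        print(name, margin_over_best(table))
```

On HA-IC13 the margin grows from 0.35 points at level 1 to 5.25 points at level 6. On CA-IC13, ABINet is slightly ahead at levels 1 and 2, and CDistNet leads from level 3 onward.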
Datasets
The datasets are the same as those used by ABINet.
- Training datasets
- Evaluation & Test datasets. The LMDB datasets can be downloaded from BaiduNetdisk(passwd:1dbv), GoogleDrive.
  - ICDAR 2013 (IC13)
  - ICDAR 2015 (IC15)
  - IIIT5K Words (IIIT)
  - Street View Text (SVT)
  - Street View Text-Perspective (SVTP)
  - CUTE80 (CUTE)
- Augment IC13
  - HA-IC13 & CA-IC13: BaiduNetdisk(passwd:d6jd), GoogleDrive
- The structure of the dataset directory is:

      dataset
      ├── eval
      │   ├── CUTE80
      │   ├── IC13_857
      │   ├── IC15_1811
      │   ├── IIIT5k_3000
      │   ├── SVT
      │   └── SVTP
      ├── train
      │   ├── MJ
      │   │   ├── MJ_test
      │   │   ├── MJ_train
      │   │   └── MJ_valid
      │   └── ST
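Before training or evaluation, it can help to confirm that the downloaded data matches the directory layout described above. The following is a minimal sketch of such a check: the sub-directory names come from that layout, while the helper name `missing_dataset_dirs` and the default root `dataset` are assumptions made for illustration and are not part of the repository.

```python
from pathlib import Path

# Sub-directories taken from the dataset layout described above.
EXPECTED_DIRS = [
    "eval/CUTE80",
    "eval/IC13_857",
    "eval/IC15_1811",
    "eval/IIIT5k_3000",
    "eval/SVT",
    "eval/SVTP",
    "train/MJ/MJ_test",
    "train/MJ/MJ_train",
    "train/MJ/MJ_valid",
    "train/ST",
]


def missing_dataset_dirs(root="dataset"):
    """Return the expected sub-directories that do not exist under `root`.

    Hypothetical helper for illustration only.
    """
    root = Path(root)
    return [sub for sub in EXPECTED_DIRS if not (root / sub).is_dir()]


if __name__ == "__main__":
    missing = missing_dataset_dirs("dataset")
    if missing:
        print("Missing directories:")
        for sub in missing:
            print("  " + sub)
    else:
        print("Dataset layout looks complete.")
```

The check only looks at directory names. It does not open the LMDB files or verify their contents.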
Environment
The required packages are listed in env_cdistnet.yaml.
# Installation
conda create -n CDistNet python=3.7
# Activate the new environment so the packages below are installed into it
conda activate CDistNet
conda install pytorch==1.5.1 torchvision==0.6.1 cudatoolkit=9.2 -c pytorch
pip install opencv-python mmcv notebook numpy einops tensorboardX Pillow thop timm tornado tqdm matplotlib lmdb
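After installation, a quick way to see whether anything is missing is to ask Python which of the required modules it can locate. The sketch below assumes the import names that correspond to the packages installed above (note that `opencv-python` is imported as `cv2` and `Pillow` as `PIL`). The helper name `find_missing` is hypothetical and not part of the repository.

```python
import importlib.util

# Import names for the packages installed above. Some pip package names differ
# from their import names: opencv-python -> cv2, Pillow -> PIL.
REQUIRED_MODULES = [
    "torch", "torchvision", "cv2", "mmcv", "notebook", "numpy", "einops",
    "tensorboardX", "PIL", "thop", "timm", "tornado", "tqdm", "matplotlib",
    "lmdb",
]


def find_missing(modules=REQUIRED_MODULES):
    """Return the module names that cannot be located in this environment.

    find_spec() looks a top-level module up without importing it.
    Hypothetical helper for illustration only.
    """
    return [name for name in modules if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    missing = find_missing()
    if missing:
        print("Missing modules: " + ", ".join(missing))
    else:
        print("All required modules were found.")
```

This only confirms that the modules can be found. It does not check their versions against env_cdistnet.yaml.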
Pretrained Models
Get the pretrained models from BaiduNetdisk(passwd:d6jd), GoogleDrive.
(The training log and result.csv are also provided in the same download.)
Place the pretrained model in models/reconstruct_CDistNetv3_3_10.
The performance of the pretrained models is summarized as follows:
Train
CUDA_VISIBLE_DEVICES=0,1,2,3 python train.py --config=configs/CDistNet_config.py
Eval
CUDA_VISIBLE_DEVICES=0 python eval.py --config=configs/CDistNet_config.py
Citation
@article{Zheng2021CDistNetPM,
title={CDistNet: Perceiving Multi-Domain Character Distance for Robust Text Recognition},
author={Tianlun Zheng and Zhineng Chen and Shancheng Fang and Hongtao Xie and Yu-Gang Jiang},
journal={ArXiv},
year={2021},
volume={abs/2111.11011}
}