## Introduction

This is a PyTorch implementation of [FOTS: Fast Oriented Text Spotting with a Unified Network](https://arxiv.org/abs/1801.01671).
Supported features:

- ICDAR dataset
- SynthText 800K dataset
- detection branch
- recognition branch
- evaluation
- multi-GPU training
- reasonable project structure
- wandb logging
- pytorch_lightning
- evaluation at different scales
- OHEM (online hard example mining)
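The OHEM item above refers to online hard example mining: keep the loss of every positive pixel, but only the hardest few negatives, so easy background does not dominate the loss. A minimal pure-Python sketch of the selection rule (the function name and `neg_ratio` default are illustrative, not this repo's API):

```python
def ohem_select(losses, is_positive, neg_ratio=3):
    """Average the loss over all positives plus the
    `neg_ratio * num_positives` hardest (highest-loss) negatives."""
    pos = [l for l, p in zip(losses, is_positive) if p]
    neg = sorted((l for l, p in zip(losses, is_positive) if not p), reverse=True)
    keep_neg = neg[: neg_ratio * max(len(pos), 1)]  # hardest negatives only
    kept = pos + keep_neg
    return sum(kept) / len(kept)
```

In the real model this runs on per-pixel score-map losses with `torch.topk` instead of `sorted`, but the selection logic is the same.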
## Instructions

### Requirements

```bash
conda create --name fots --file spec-file.txt
conda activate fots
pip install -r reqs.txt
cd FOTS/rroi_align
python build.py develop
```
### Training

```bash
# Single-GPU training is straightforward: set gpus to [0], where 0 is the id of your GPU.
python train.py -c pretrain.json
python train.py -c finetune.json
```
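The GPU selection mentioned in the comment lives in the JSON config passed via `-c`. A hypothetical fragment (only the `gpus` key is taken from the comment above; see `pretrain.json`/`finetune.json` for the actual schema):

```json
{
  "gpus": [0]
}
```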
### Evaluation

```bash
python eval.py \
    -c finetune.json \
    -m <your ckpt> \
    -i <icdar2015 folder containing train and test> \
    --detection \
    -o ./results \
    --cuda \
    --size "1280 720" \
    --bs 2 \
    --gpu 1
```

Pass the `--detection` flag to evaluate detection only; omit it to evaluate end to end (detection + recognition).
## Benchmarking and Models

Below are E2E Generic benchmarking results on ICDAR2015. The model was pretrained on SynthText (7 epochs). Pretrained model (code: 68ta). Finetuned (5000 epochs) model (code: s38c).
| Name | Backbone | Scale (W × H) | Hmean |
|---|---|---|---|
| FOTS (paper) | ResNet-50 | 2240 × 1260 | 60.8 |
| FOTS (ours) | ResNet-50 | 2240 × 1260 | 46.2 |
| FOTS RT (paper) | ResNet-34 | 1280 × 720 | 51.4 |
| FOTS RT (ours) | ResNet-50 | 1280 × 720 | 47.0 |
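Hmean in the table is the ICDAR protocol's harmonic mean of detection precision and recall (the F-score). For reference:

```python
def hmean(precision, recall):
    """Harmonic mean of precision and recall, as reported by the
    ICDAR evaluation: 2PR / (P + R)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)
```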
## Samples
## Acknowledgements

- https://github.com/SakuraRiven/EAST (some code is copied from here)
- https://github.com/chenjun2hao/FOTS.pytorch.git (ROIRotate)