  • Stars: 270
  • Rank: 152,189 (Top 3%)
  • Language: C++
  • License: MIT License
  • Created almost 5 years ago
  • Updated 4 months ago

Repository Details

Deep relational reasoning graph network for arbitrary shape text detection; Accepted by CVPR 2020 (Oral). http://arxiv.org/abs/2003.07493

This is an implementation of “Deep relational reasoning graph network for arbitrary shape text detection”.

Prerequisites

Python 3.7;
PyTorch 1.2.0;
NumPy >= 1.16;
CUDA 10.1;
GCC >= 9.0;
opencv-python < 4.5.0;
NVIDIA GPU (with 10 GB or more of GPU memory for inference).
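
The Python-side prerequisites above can be checked before compiling. The following is a minimal sketch, not part of the repository: the helper names `parse_version`, `collect_versions`, and `check_versions` are hypothetical, and the CUDA, GCC, and GPU memory requirements are not covered.

```python
import importlib
import platform
import re


def parse_version(text):
    """Turn a string such as '1.2.0+cu101' into (1, 2, 0) using the leading digits of each part."""
    parts = []
    for piece in text.split("."):
        match = re.match(r"\d+", piece)
        if match is None:
            break
        parts.append(int(match.group()))
    return tuple(parts)


def collect_versions():
    """Read version strings from the running interpreter and any importable packages."""
    versions = {"python": platform.python_version()}
    for name in ("torch", "numpy", "cv2"):
        try:
            module = importlib.import_module(name)
        except ImportError:
            continue  # a missing package is reported as a failed check later
        versions[name] = getattr(module, "__version__", None)
    return versions


def check_versions(versions):
    """Compare version strings against the prerequisites listed in this README."""
    def parsed(name):
        text = versions.get(name)
        return parse_version(text) if text else None

    py, torch, numpy, cv2 = (parsed(n) for n in ("python", "torch", "numpy", "cv2"))
    return {
        "python": py is not None and py[:2] == (3, 7),            # Python 3.7
        "torch": torch is not None and torch[:3] == (1, 2, 0),    # PyTorch 1.2.0
        "numpy": numpy is not None and numpy >= (1, 16),          # NumPy >= 1.16
        "cv2": cv2 is not None and cv2 < (4, 5, 0),               # opencv-python < 4.5.0
    }
```

Calling `check_versions(collect_versions())` returns a dictionary of booleans, one per requirement; a `False` entry means the package is missing or its version falls outside the listed range.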

Compile

# run both commands from the repository root
(cd ./csrc && make)
(cd ./nmslib/lanms && make)

Data Links

Note: download the datasets and place them under the data directory.

  1. CTW1500
  2. TD500
  3. Total-Text

Models

  • Trained models for Total-Text, CTW-1500, MSRA-TD500, MLT2017, and ICDAR2015 are all available here:
    Google Drive or Baidu Drive (download code: cfat)

Train

cd tool
sh train_CTW1500.sh  # or run one of the other shell scripts

Modify the relevant training parameters, such as gpu_id and input_size, to suit your environment:

#!/bin/bash
cd ../
CUDA_LAUNCH_BLOCKING=1 python train_TextGraph.py --exp_name Ctw1500 --max_epoch 600 --batch_size 6 --gpu 0 --input_size 640 --optim SGD --lr 0.001 --start_epoch 0 --viz --net vgg 
# --resume pretrained/mlt2017_pretain/textgraph_vgg_100.pth  # load the pretrained model; change this path to your own
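
To make the flags in the command above easier to read, here is a minimal argparse sketch that accepts the same options. It is an illustration only: the function name `build_parser` is hypothetical, the types are inferred from the example values, the defaults simply mirror the example command, and the actual definitions in option.py may differ.

```python
import argparse


def build_parser():
    """Hypothetical parser mirroring the flags of the example training command."""
    parser = argparse.ArgumentParser(description="Sketch of the training options")
    parser.add_argument("--exp_name", type=str, default="Ctw1500", help="experiment name")
    parser.add_argument("--max_epoch", type=int, default=600, help="last training epoch")
    parser.add_argument("--batch_size", type=int, default=6, help="images per batch")
    parser.add_argument("--gpu", type=str, default="0", help="GPU id to use")
    parser.add_argument("--input_size", type=int, default=640, help="input image size")
    parser.add_argument("--optim", type=str, default="SGD", help="optimizer name")
    parser.add_argument("--lr", type=float, default=0.001, help="learning rate")
    parser.add_argument("--start_epoch", type=int, default=0, help="epoch to start from")
    parser.add_argument("--viz", action="store_true", help="enable visualization")
    parser.add_argument("--net", type=str, default="vgg", help="backbone network")
    parser.add_argument("--resume", type=str, default=None, help="path to a pretrained model")
    return parser
```

Passing the arguments of the example command to `build_parser().parse_args(...)` yields, for instance, `batch_size == 6`, `lr == 0.001`, and `viz == True`; `resume` stays `None` unless the commented-out `--resume` option is added.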

Eval

First, modify the relevant parameters in config.py and option.py as needed.

python eval_TextGraph.py  # test a single-round model
or
python batch_eval.py  # test multi-round models
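
When several checkpoints are to be evaluated, it can help to list them in training order first. The sketch below assumes that checkpoint files follow the naming seen in the training script (for example `textgraph_vgg_100.pth`, read here as `textgraph_<net>_<epoch>.pth`) and that a "round" corresponds to that epoch number. Both are assumptions; `list_checkpoints` is a hypothetical helper, not a function from batch_eval.py.

```python
import re
from pathlib import Path

# Assumed naming scheme: textgraph_<net>_<epoch>.pth
CHECKPOINT_PATTERN = re.compile(r"^textgraph_(?P<net>\w+?)_(?P<epoch>\d+)\.pth$")


def list_checkpoints(directory, net="vgg"):
    """Return checkpoint paths for one backbone, sorted by numeric epoch."""
    found = []
    for path in Path(directory).glob("*.pth"):
        match = CHECKPOINT_PATTERN.match(path.name)
        if match and match.group("net") == net:
            found.append((int(match.group("epoch")), path))
    # Sort on the integer epoch so that epoch 20 comes before epoch 100.
    found.sort(key=lambda item: item[0])
    return [path for _, path in found]
```

Sorting on the parsed integer rather than the file name avoids the lexicographic ordering in which `textgraph_vgg_100.pth` would come before `textgraph_vgg_20.pth`.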

Qualitative results (view)

[Two screenshots of qualitative results]

Citing the related works

@inproceedings{DBLP:conf/cvpr/ZhangZHLYWY20,
  author       = {Shi{-}Xue Zhang and
                  Xiaobin Zhu and
                  Jie{-}Bo Hou and
                  Chang Liu and
                  Chun Yang and
                  Hongfa Wang and
                  Xu{-}Cheng Yin},
  title        = {Deep Relational Reasoning Graph Network for Arbitrary Shape Text Detection},
  booktitle    = {2020 {IEEE/CVF} Conference on Computer Vision and Pattern Recognition,
                  {CVPR} 2020, Seattle, WA, USA, June 13-19, 2020},
  pages        = {9696--9705},
  publisher    = {Computer Vision Foundation / {IEEE}},
  year         = {2020},
  doi          = {10.1109/CVPR42600.2020.00972},
}

@inproceedings{DBLP:conf/iccv/Zhang0YWY21,
  author    = {Shi{-}Xue Zhang and
               Xiaobin Zhu and
               Chun Yang and
               Hongfa Wang and
               Xu{-}Cheng Yin},
  title     = {Adaptive Boundary Proposal Network for Arbitrary Shape Text Detection},
  booktitle = {2021 {IEEE/CVF} International Conference on Computer Vision, {ICCV} 2021, Montreal, QC, Canada, October 10-17, 2021},
  pages     = {1285--1294},
  publisher = {{IEEE}},
  year      = {2021},
}

@article{zhang2023arbitrary,
  title={Arbitrary shape text detection via boundary transformer},
  author={Zhang, Shi-Xue and Yang, Chun and Zhu, Xiaobin and Yin, Xu-Cheng},
  journal={IEEE Transactions on Multimedia},
  year={2023},
  publisher={IEEE}
}

@article{DBLP:journals/pami/ZhangZCHY23,
  author       = {Shi{-}Xue Zhang and
                  Xiaobin Zhu and
                  Lei Chen and
                  Jie{-}Bo Hou and
                  Xu{-}Cheng Yin},
  title        = {Arbitrary Shape Text Detection via Segmentation With Probability Maps},
  journal      = {{IEEE} Trans. Pattern Anal. Mach. Intell.},
  volume       = {45},
  number       = {3},
  pages        = {2736--2750},
  year         = {2023},
  url          = {https://doi.org/10.1109/TPAMI.2022.3176122},
  doi          = {10.1109/TPAMI.2022.3176122},
}

@article{zhang2022kernel,
  title={Kernel proposal network for arbitrary shape text detection},
  author={Zhang, Shi-Xue and Zhu, Xiaobin and Hou, Jie-Bo and Yang, Chun and Yin, Xu-Cheng},
  journal={IEEE Transactions on Neural Networks and Learning Systems},
  year={2022},
  publisher={IEEE}
}

License

This project is licensed under the MIT License; see the LICENSE.md file for details.

More Repositories

1

TextBPN-Plus-Plus

Arbitrary Shape Text Detection via Boundary Transformer; paper: https://arxiv.org/abs/2205.05320, accepted by IEEE Transactions on Multimedia (T-MM 2023).
Python
174
star
2

TextBPN

Adaptive Boundary Proposal Network for Arbitrary Shape Text Detection; accepted by ICCV 2021; paper: http://arxiv.org/abs/2107.12664
Python
113
star
3

TextPMs

Arbitrary Shape Text Detection via Segmentation with Probability Maps; accepted by TPAMI2022
C++
96
star
4

GHM_Loss

A TensorFlow implementation of GHM loss, including the classification loss and the regression loss. GHM loss was proposed in "Gradient Harmonized Single-stage Detector", published at AAAI 2019 (Oral).
Python
65
star
5

OHEM-loss

TensorFlow implementation of OHEM loss; supports sigmoid or softmax entropy loss
Python
31
star
6

GloRe

TensorFlow implementation of the Global Reasoning unit (GloRe) from Graph-Based Global Reasoning Networks. GCN network block
Python
28
star
7

Focal-loss

TensorFlow implementation of focal loss from "Focal Loss for Dense Object Detection". https://arxiv.org/abs/1708.02002
Python
20
star
8

AnalysisEEG

2020 Graduate Mathematical Modeling Contest, Problem C: brain wave (EEG) analysis (code and data)
Python
9
star
9

TaggingTool

An annotation tool for object detection and text detection that supports both image and video files and runs only on Windows. labelMe, Tagging, Annotation.
C#
8
star
10

STGT

Video-Language Alignment via Spatio–Temporal Graph Transformer; ArXiv: https://arxiv.org/abs/2407.11677
Python
6
star
11

TextFormat_To_cocoJson

Converts detection txt format to COCO JSON format
Python
4
star
12

GXYM.github.io

HTML
2
star
13

python-Interface-Cpp

Interface Python code with C++; supports Python 3
Makefile
2
star