  • Stars: 475
  • Rank: 91,834 (Top 2 %)
  • Language: Python
  • License: MIT License
  • Created almost 5 years ago
  • Updated about 1 year ago

Repository Details

Code for the paper "DewarpNet: Single-Image Document Unwarping With Stacked 3D and 2D Regression Networks" (ICCV '19)

DewarpNet

This repository contains the code for training DewarpNet.

Recent Updates

  • [May, 2020] Added evaluation images and an important note about Matlab SSIM.
  • [Dec, 2020] Added OCR evaluation details.
  • [Sep, 2021] Released DewarpNet final models used in the paper.

Training

  • Prepare data: create train.txt and val.txt, with one sample ID per line, for example:
1/824_8-cp_Page_0503-7Ns0001
1/824_1-cp_Page_0504-2Cw0001
  • Train Shape Network: python trainwc.py --arch unetnc --data_path ./data/DewarpNet/doc3d/ --batch_size 50 --tboard
  • Train Texture Mapping Network: python trainbm.py --arch dnetccnl --img_rows 128 --img_cols 128 --img_norm --n_epoch 250 --batch_size 50 --l_rate 0.0001 --tboard --data_path ./DewarpNet/doc3d
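The train.txt and val.txt files from the first step could be produced with a small helper like the one below. This is only a sketch: the function names, the 10% validation fraction, and the fixed seed are assumptions for illustration, and the repository may split its data differently.

```python
import random
from pathlib import Path


def split_samples(sample_ids, val_fraction=0.1, seed=0):
    """Shuffle sample IDs deterministically and split them into (train, val)."""
    ids = sorted(set(sample_ids))        # de-duplicate and fix the starting order
    random.Random(seed).shuffle(ids)     # seeded shuffle so the split is reproducible
    # Keep at least one validation sample when there is more than one ID.
    n_val = max(1, int(len(ids) * val_fraction)) if len(ids) > 1 else 0
    return ids[n_val:], ids[:n_val]


def write_split(path, ids):
    """Write one sample ID per line, matching the format shown above."""
    Path(path).write_text("\n".join(ids) + "\n")


if __name__ == "__main__":
    # The two example IDs from the list above; a real run would pass all IDs.
    samples = [
        "1/824_8-cp_Page_0503-7Ns0001",
        "1/824_1-cp_Page_0504-2Cw0001",
    ]
    train_ids, val_ids = split_samples(samples, val_fraction=0.5)
    write_split("train.txt", train_ids)
    write_split("val.txt", val_ids)
```

Using a fixed seed keeps the split stable across runs, so the shape and texture mapping networks see the same validation samples.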

Inference:

  • Run: python infer.py --wc_model_path ./eval/models/unetnc_doc3d.pkl --bm_model_path ./eval/models/dnetccnl_doc3d.pkl --show

Evaluation (Image Metrics):

  • We use the same evaluation code as DocUNet. To reproduce the quantitative results reported in the paper, use the images available here.

  • [Important note about Matlab version] We noticed that Matlab 2020a uses a different SSIM implementation, which gives a better MS-SSIM score (0.5623), whereas we used Matlab 2018b. Please compare the scores according to your Matlab version.

Evaluation (OCR Metrics):

  • The 25 images used for OCR evaluation are listed in /eval/ocr_eval/ocr_files.txt
  • The corresponding ground-truth text is given in /eval/ocr_eval/tess_gt.json
  • For the OCR errors reported in the paper, we used cv2.blur as a pre-processing step, which gives a higher error in all cases. For convenience, we provide the updated numbers (without blur) in the following table:
Method             ED        CER              ED (no blur)   CER (no blur)
DocUNet            1975.86   0.4656 (0.263)   1671.80        0.403 (0.256)
DocUNet on Doc3D   1684.34   0.3955 (0.272)   1296.00        0.294 (0.235)
DewarpNet          1288.60   0.3136 (0.248)   1007.28        0.249 (0.236)
DewarpNet (ref)    1114.40   0.2692 (0.234)   812.48         0.204 (0.228)
  • For evaluation, we used Tesseract (v4.1.0) in its default configuration with PyTesseract (v0.2.6).
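The ED and CER columns can be illustrated with a short sketch. It assumes that ED is the Levenshtein edit distance between the OCR output and the ground-truth text, and that CER is that distance divided by the number of ground-truth characters. These definitions are not spelled out in this README, and the function names are hypothetical rather than taken from the repository's evaluation code.

```python
def edit_distance(pred, ref):
    """Levenshtein distance: minimum number of character insertions,
    deletions, and substitutions needed to turn pred into ref."""
    prev = list(range(len(ref) + 1))     # distances from the empty prefix of pred
    for i, p in enumerate(pred, start=1):
        curr = [i]                       # turning pred[:i] into "" takes i deletions
        for j, r in enumerate(ref, start=1):
            cost = 0 if p == r else 1
            curr.append(min(
                prev[j] + 1,             # delete p
                curr[j - 1] + 1,         # insert r
                prev[j - 1] + cost,      # substitute p with r (or match)
            ))
        prev = curr
    return prev[-1]


def character_error_rate(pred, ref):
    """Edit distance normalized by ground-truth length (assumed CER definition)."""
    if not ref:
        raise ValueError("ground-truth text must not be empty")
    return edit_distance(pred, ref) / len(ref)


if __name__ == "__main__":
    ocr_text = "DewarpNat unwarps documents"
    gt_text = "DewarpNet unwarps documents"
    print(edit_distance(ocr_text, gt_text), character_error_rate(ocr_text, gt_text))
```

In an actual evaluation, the predicted text would come from running Tesseract on each unwarped image, and the reference text from the ground-truth file listed above.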

Models:

  • Pre-trained models are available here. These models were captured before end-to-end training, so they won't reproduce the end-to-end results reported in Table 2 of the paper. Use the images provided above to get the exact numbers in Table 2.
  • Final models are available here. These models can be used to unwarp DocUNet images and reproduce the results in the ICCV paper.

Dataset:

  • The doc3D dataset can be downloaded using the scripts here.

More Stuff:

Citation:

If you use the dataset or this code, please consider citing our work:

@inproceedings{SagnikKeICCV2019, 
Author = {Sagnik Das*, Ke Ma*, Zhixin Shu, Dimitris Samaras, Roy Shilkrot}, 
Booktitle = {Proceedings of International Conference on Computer Vision}, 
Title = {DewarpNet: Single-Image Document Unwarping With Stacked 3D and 2D Regression Networks}, 
Year = {2019}}   

Acknowledgements:

More Repositories

1. LearningToCountEverything (Python, 342 stars)
2. DM-Count (Python, 204 stars): Code for NeurIPS 2020 paper: Distribution Matching for Crowd Counting.
3. doc3D-dataset (Shell, 155 stars): A hybrid dataset for document unwarping (Paper: https://www3.cs.stonybrook.edu/~cvl/projects/dewarpnet/storage/paper.pdf)
4. SID (Jupyter Notebook, 96 stars): Official implementation for ICCV19 "Shadow Removal via Shadow Image Decomposition"
5. PaperEdge (Python, 83 stars): The code and the DIW dataset for "Learning From Documents in the Wild to Improve Document Unwarping" (SIGGRAPH 2022)
6. Scanpath_Prediction (Python, 76 stars): Predicting Goal-directed Human Attention Using Inverse Reinforcement Learning (CVPR2020)
7. BodyHands (Python, 70 stars): Whose Hands Are These? Hand Detection and Hand-Body Association in the Wild, CVPR 2022
8. zero-shot-counting (Python, 49 stars): CVPR2023 Zero-shot Counting
9. ContactHands (Python, 45 stars): Detecting Hands and Recognizing Physical Contact in the Wild, NeurIPS 2020.
10. fsl-rsvae (Python, 34 stars)
11. DocIIW (Python, 34 stars): Repository for Intrinsic Decomposition of Document Images In-the-Wild (BMVC '20)
12. EmotionNet_CVPR2020 (Python, 30 stars)
13. local_learning_wsi (Python, 27 stars): Repository for "Gigapixel Whole-Slide Images Classification using Locally Supervised Learning"
14. PathLDM (Jupyter Notebook, 27 stars): Official Code for PathLDM: Text conditioned Latent Diffusion Model for Histopathology (WACV 2024)
15. SAMPath (Python, 22 stars): Repository for "SAM-Path: A Segment Anything Model for Semantic Segmentation in Digital Pathology" (MedAGI2023, MICCAI2023 workshop)
16. HandLer (Python, 20 stars): Forward Propagation, Backward Regression and Pose Association for Hand Tracking in the Wild (CVPR 2022)
17. scenes100 (Python, 20 stars)
18. Large-Image-Diffusion (Jupyter Notebook, 20 stars): CVPR 2024: Learned representation-guided diffusion models for large-image generation
19. vfd-iccv21 (Python, 19 stars)
20. PromptMIL (Python, 16 stars): Repository for "Prompt-MIL: Boosting Multi-Instance Learning Schemes via Task-specific Prompt Tuning" (MICCAI2023)
21. SelfMedMAE (Python, 16 stars): Code for ISBI 2023 paper "Self Pre-training with Masked Autoencoders for Medical Image Classification and Segmentation"
22. Iso-UVField (Python, 13 stars): Learning an Isometric Surface Parameterization for Texture Unwrapping (ECCV 2022)
23. Emotion-Prediction (Python, 12 stars): Visual Emotion Prediction (as a single-label problem) -- MS Thesis
24. Gazeformer (Python, 12 stars): Official codebase for "Gazeformer: Scalable, Effective and Fast Prediction of Goal-Directed Human Attention" (CVPR 2023)
25. Target-absent-Human-Attention (Python, 11 stars): Target-absent Human Attention (ECCV2022)
26. PLM_SSL (Python, 8 stars): Repository for "Precise Location Matching Improves Dense Contrastive Learning in Digital Pathology"
27. LSAE (Python, 7 stars): PyTorch Implementation of Lung Swapping Autoencoder
28. fewshot-conditional-diffusion (Jupyter Notebook, 7 stars): Official code for "Conditional Generation from Unconditional Diffusion Models using Denoiser Representations" (BMVC 2023)
29. HyperMAE (Python, 3 stars)
30. infinity-brush (2 stars)
31. EnEx (Python, 2 stars): Code and datasets for BMVC 2021 paper "Exemplar-Based Early Event Prediction in Video"
32. JEAN (2 stars)
33. hematopoiesis-relationvae (1 star)
34. TokenSparse-for-MedSeg (1 star): Code for IPMI2023 paper "Token Sparsification for Faster Medical Image Segmentation"
35. GCDR-Gaze (1 star): Repository of the paper "Diffusion-Refined VQA Annotations for Semi-Supervised Gaze Following" (ECCV 2024)