• Stars: 112
• Rank 311,303 (Top 7 %)
• Language: Python
• License: GNU General Publi...
• Created about 8 years ago
• Updated about 5 years ago

Repository Details

OTR

Optical table recognition - recognize tables in scan images using OpenCV.

OTR uses a raster-based method to recognize tables, even when they have dashed lines or the page is slightly skewed (for example, when scanning a book). OTR cannot be used for tables without a visible raster!
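To illustrate why a visible raster matters, here is a minimal pure-Python sketch of a projection-profile approach. It is not OTR's actual algorithm (see Algorithm.pdf for that) and it does not use OpenCV. The function names, the 0/1 image representation and the `min_fill` threshold are all assumptions made for this example. A row or column counts as a raster line when enough of its pixels are dark, which is why a dashed line can still be found. Skew is not handled here.

```python
def find_raster_lines(image, min_fill=0.5):
    """Return (rows, cols) whose share of dark pixels is at least min_fill.

    image is a list of equal-length rows; 1 = dark pixel, 0 = background.
    min_fill is a hypothetical threshold chosen for this sketch.
    """
    height = len(image)
    width = len(image[0]) if height else 0
    rows = [y for y in range(height) if sum(image[y]) >= min_fill * width]
    cols = [
        x for x in range(width)
        if sum(image[y][x] for y in range(height)) >= min_fill * height
    ]
    return rows, cols


def group_runs(indices):
    """Merge consecutive indices into (start, end) runs so a thick line counts once."""
    runs = []
    for i in indices:
        if runs and i == runs[-1][1] + 1:
            runs[-1] = (runs[-1][0], i)
        else:
            runs.append((i, i))
    return runs


def make_demo_image(size=9):
    """Build a 2x2-cell table raster whose middle horizontal line is dashed."""
    image = [[0] * size for _ in range(size)]
    for line in (0, size // 2, size - 1):
        for k in range(size):
            image[line][k] = 1  # horizontal line
            image[k][line] = 1  # vertical line
    for x in (2, 6):
        image[size // 2][x] = 0  # punch gaps to make the middle row dashed
    return image


if __name__ == "__main__":
    rows, cols = find_raster_lines(make_demo_image())
    print("horizontal lines:", group_runs(rows))
    print("vertical lines:", group_runs(cols))
```

With the demo image, the dashed middle row is still detected because 7 of its 9 pixels are dark. A table without drawn lines would produce no rows or columns above the threshold, so there would be nothing to build cells from.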

Install

Install OpenCV for Python 3, e.g. on Ubuntu 18.04: sudo apt-get install python3-opencv

On a deb-based Linux system, I recommend installing numpy and scipy from apt to speed up the dependency install: sudo apt-get install python3-scipy python3-numpy

cv_algorithms is one of the dependencies; it provides some of the algorithms used in OTR in a reusable form.

sudo pip3 install -r requirements.txt
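After installing, a quick way to confirm that the dependencies named above are importable is a small check like the following. This is a sketch, not part of OTR. The helper name `missing_modules` is hypothetical, and the import names (`cv2` for OpenCV, plus `numpy`, `scipy` and `cv_algorithms`) are assumed to match the installed packages.

```python
import importlib.util


def missing_modules(names):
    """Return the subset of top-level module names that cannot be found."""
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    # Assumed import names for the dependencies mentioned in the install notes.
    missing = missing_modules(["cv2", "numpy", "scipy", "cv_algorithms"])
    if missing:
        print("Missing modules:", ", ".join(missing))
    else:
        print("All listed modules were found.")
```

The check only looks up each module and does not import it, so it is fast but will not catch a broken installation that fails at import time.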

Run

Get a test image, e.g. search Google for images like "Old naval log table" and select one that contains a table. I can't share one here due to copyright, but if you know of a public-domain one, please add it via a pull request.

python3 test-otr.py <image filename>

It's currently only a proof of concept. See Algorithm.pdf for details on how it works.

More Repositories

1. OCCUtils (C++, 51 stars): OpenCASCADE utility library - algorithms and convenience functions
2. UliEngineering (Python, 50 stars): A Python library for calculations performed in electronics engineering
3. deb-buildscripts (Shell, 44 stars): Debian Buildscripts
4. ODBPy (Python, 31 stars): ODB++ support for Python
5. GuardMyWire (Python, 23 stars): Generate WireGuard configs for Linux and MikroTik devices
6. cv_algorithms (Python, 14 stars): Optimized OpenCV extra algorithms for Python2/3
7. entropy-analysis-tools (C++, 9 stars)
8. MeSH-JSON (C++, 6 stars): Lightning-fast MeSH (NCBI Medical Subject Headings) database bulk data reader & JSON lines converter
9. LabInstruments (Python, 5 stars): PyVISA wrappers for some of my lab instruments: Rigol DL3021, Agilent DSO-X 3024
10. STM8S-SDCC-SPL (C, 5 stars): STM8S Standard peripheral library for SDCC, split into individual files for small binaries.
11. KiCADBulkHideSilkscreenDesignators (Python, 4 stars): KiCAD pcbnew plugin to hide silkscreen reference designators such as "R4" from one or multiple footprints using a single click
12. cereal-text (Haskell, 4 stars): Data.Text instances for Cereal serialization
13. buildock (Dockerfile, 3 stars): Reproducible build environments for local builds using Docker
14. graph-generators (Haskell, 3 stars): A Haskell library for creating random Data.Graph instances using several pop
15. pseudo-perseus (JavaScript, 2 stars): Informative renderer for Khan Academy's Perseus markdown format
16. Schoolalyzer (Java, 2 stars)
17. Bachelor (TeX, 2 stars): My Bachelor's thesis, themed 'Algorithms for resource-constrained domain-specific knowledge management'
18. FixMyDNS (C#, 2 stars): Tired of telling people how to manually configure the DNS servers to fix DNS issues on Windows? Just send them FixMyDNS.
19. jbookmanager (Java, 2 stars)
20. 1x1 (Java, 2 stars)
21. Nexys3Features (VHDL, 2 stars): VHDL snippets and projects for the Digilent Nexys3 FPGA board
22. YakDB (C++, 2 stars): Yet another Key-Value Database -- a fast yet simple key-value database layer based on RocksDB and ZeroMQ
23. xraysim (C, 2 stars)
24. KiCADSamacSysImporter (Python, 2 stars): A Python script that imports SamacSys ComponentSearchEngine ZIP files into a KiCAD project.
25. ChipDBEditor (Java, 2 stars): A desktop editor for ChipDB
26. ZeitLos-Layout (2 stars)
27. Portfolio (1 star)
28. KADeutschIssues (Python, 1 star): Khan Academy German translation issues
29. ubuntu-python3-opencv-docker (Dockerfile, 1 star): Ubuntu Docker image with preinstalled OpenCV for Python3
30. SDCheck (C++, 1 star): Utility to check for fake SD cards
31. jitsi-meet-conference-mapper (JavaScript, 1 star): A simple Jitsi Meet Conference Mapper server in NodeJS
32. ep_etherlatex (JavaScript, 1 star): Simple server-side PDF-LaTeX compiler plugin for Etherpad lite.
33. Facharbeit (1 star)
34. JSpeedWriter (Java, 1 star)
35. PCBCheck (Python, 1 star): A script for checking PCB production data
36. ForecastDiagnostics (Java, 1 star)
37. PFSenseAutomation (JavaScript, 1 star): TamperMonkey script to automate processes in the PFSense web interface
38. InvenTreeQuickAdd (TypeScript, 1 star): Web GUI & Server to quickly add electronics components to InvenTree using DigiKey / Mouser etc APIs
39. TechOverflowVisualizations (TeX, 1 star): Visualizations for TechOverflow, mostly TikZ
40. Programmierpraktikum (Java, 1 star): My team's software for the Bioinformatics Programmierpraktikum at the Munich university
41. SudokuSolver (Java, 1 star)
42. medigl (C++, 1 star): Medical GL application
43. UliAcceleration (Python, 1 star): Fast accelerated math routines for Python
44. stm8s-discovery-sdcc-blink (C, 1 star): A minimal demo project demonstrating blink & build with SDCC & CMake
45. KiCAD-ProcessAutomation (Python, 1 star): Quality management & automation tools for KiCAD
46. BeastHTTPServer (C++, 1 star): Easy-to-use basic C++ webserver library using boost::beast
47. group-with (Haskell, 1 star): A Haskell library to classify objects by a function value, just like SQL GROUP BY