Introduction to OCR with Python
by Dr. W.J.B. Mattingly
Introduction
Optical Character Recognition, or OCR, is a common task in many domains. The earliest OCR systems were designed to serve the visually impaired. Modern applications, however, reach a far wider population. The goal of OCR is to take an input image and output raw text while maintaining the structure of the text in the image. In other words, the end goal is to preserve the line breaks, paragraph segmentation, and other structural features of the text on the page.
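To make the idea of preserving structure concrete, here is a minimal sketch that is not taken from the course itself. It assumes an OCR engine has already returned each recognized word together with its pixel position. The `Word` tuple, the `words_to_text` helper, the `line_tolerance` parameter, and the sample data are all hypothetical names invented for illustration.

```python
from typing import List, Tuple

# Hypothetical OCR output: (text, x, y) for each recognized word, where
# (x, y) is the top-left corner of the word in pixels.
Word = Tuple[str, int, int]


def words_to_text(words: List[Word], line_tolerance: int = 10) -> str:
    """Rebuild line-structured text from word positions (illustrative only)."""
    lines: List[List[Word]] = []
    # Sort top to bottom so words on the same line end up adjacent.
    for word in sorted(words, key=lambda w: w[2]):
        # A word joins the current line if its y value is close to the
        # y value of that line's first word; otherwise it starts a new line.
        if lines and abs(word[2] - lines[-1][0][2]) <= line_tolerance:
            lines[-1].append(word)
        else:
            lines.append([word])
    # Within each line, order words left to right, then join lines with breaks.
    return "\n".join(
        " ".join(w[0] for w in sorted(line, key=lambda w: w[1]))
        for line in lines
    )


# Words arrive in arbitrary order, as they might from a recognizer.
sample = [
    ("world", 60, 12),
    ("Hello", 10, 10),
    ("line", 70, 41),
    ("Second", 10, 40),
]
result = words_to_text(sample)
print(result)
```

The 10-pixel tolerance is an arbitrary choice for this sample; real scans would likely need a value tuned to the font size and image resolution.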
This course is designed to teach you how to automate OCR in Python and tune each step for better results. It is meant to be used alongside the YouTube series "OCR in Python Tutorials".
Organization of Textbook
Lesson | Name |
---|---|
01.01 | Introduction to OCR |
01.02 | Introduction to the Libraries |
01.03 | How to Install Libraries |
02.01 | The Basics of Pillow |
02.02 | The Basics of OpenCV |
02.03 | The Basics of Tesseract |
03.01 | Passing Pillow Images to OpenCV |
03.02 | The Basics of OpenCV |
03.03 | Manipulating the Image |
04.01 | Bounding Boxes |
04.02 | Extracting Bounding Boxes |
04.03 | Organizing Bounding Boxes |
05.01 | Parameters of Tesseract |
05.02 | Cleaning the Output of Tesseract |
06.01 | Workflow for Standard OCR of Text |
06.02 | Workflow for Ignoring Footnotes |
06.03 | Workflow for Tables |
07.01 | Tesseract with non-English |
07.02 | Tesseract with Early Modern Scripts |
07.03 | Tesseract with non-Latin Scripts |