  • Stars: 1,649
  • Rank 28,345 (Top 0.6 %)
  • License: MIT License
  • Created over 8 years ago
  • Updated over 2 years ago

Repository Details

A curated list of promising OCR resources

**Update: a daily paper tracking tool has been added**

awesome-ocr

Daily OCR paper tracking

The website source code is available. To modify the tracked keywords or add new ones for tracking and sharing, edit this file and submit a request: https://github.com/wanghaisheng/ocr-arxiv-daily/blob/main/database/topic.yml

AI-Paper-Collector

Fully-automated scripts for collecting AI-related papers. Support fuzzy and exact search for paper titles.

https://github.com/wanghaisheng/ocr-paper-collector

Tracking tweets that mention OCR

https://github.com/wanghaisheng/ocr-tweets-monitoring

Libraries

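There are 2 APIs.
Both accept images.
Baidu's own: can basically be ruled out.
Lab report recognition: can extract only one of the three fields on a lab report.
The third-party APIs and Alibaba's own APIs concentrate on ID cards, bank cards, driver's licenses, passports, e-commerce product review text, license plates, business cards, Tieba post text, and text in video. Most output the characters with their coordinates, and card types can be output as structured fields. The price is around 0.01.
Three other providers offer resume parsing. Their output is mostly structured fields, and they accept both document and image formats. Prices range from 0.1 to 0.3 per call.
No third parties have joined so far; there are only Tencent's own APIs, covering license plates, business cards, ID cards, driver's licenses, bank cards, business licenses, and general printed text. The price can reach about 0.2.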
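Where does OcrKing come from?

OcrKing originated in early 2009 as a personal project that Aven used for data mining. Driven by dedication to and enthusiasm for the technology, it has gone through nearly seven years of accumulation and iteration and has now evolved into a cloud-based OCR system that combines multi-layer neural networks and deep learning. In early 2010, a web version of the text OCR service was built to make it available to more users. From the beginning, OcrKing has offered free recognition services and a developer API, and it will continue to provide free cloud OCR. OcrKing has never been promoted, but it has quietly and genuinely existed all along, because it believes that those who need it will find it. You are welcome to recommend OcrKing online recognition to friends with similar needs. We hope this tool is useful to you; thank you all for your support!

What can OcrKing do?

OcrKing is a free, fast, easy-to-use online cloud OCR platform that recognizes the content of PDFs and images and produces an editable document. It supports multiple input and output file formats, multiple languages (Simplified Chinese, Traditional Chinese, English, Japanese, Korean, German, French, and others), multiple recognition modes, multiple system platforms, and multiple forms of API calls.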
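Ultra-lightweight models, as small as 9.4M. Supports models for 80+ languages and includes a built-in training module, a semi-supervised annotation tool, and a layout analysis model. Particularly strong at Chinese recognition.
Inherits the strengths of PaddleOCR, adds an ONNX backend on top of it, and additionally supports DirectX acceleration, which gives it notable compatibility and performance advantages when deployed on Windows.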
Warp-CTC is a library that implements CTC efficiently and in parallel on both CPUs and GPUs. Connectionist Temporal Classification (CTC) is a loss function useful for performing supervised learning on sequence data, without needing an alignment between input data and labels. For example, CTC can be used to train end-to-end systems for speech recognition, which is how it has been used at Baidu's Silicon Valley AI Lab.
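To make "no alignment needed" concrete, here is a minimal sketch of the CTC likelihood in pure Python. It is not Warp-CTC's code or API: the function names, the choice of index 0 for the blank symbol, and the use of plain probabilities (practical implementations usually work in log space for numerical stability) are all assumptions made for illustration. The forward recursion sums over every frame-level path that collapses to the target labels, and a brute-force enumeration is included only to check the recursion on tiny inputs.

```python
import math
from itertools import product

BLANK = 0  # assumed index of the CTC blank symbol (illustrative choice)


def collapse(path):
    """Map a frame-level path to labels: merge repeated symbols, then drop blanks."""
    out = []
    prev = None
    for symbol in path:
        if symbol != prev and symbol != BLANK:
            out.append(symbol)
        prev = symbol
    return out


def ctc_neg_log_likelihood(probs, labels):
    """CTC loss for one sequence via the forward (alpha) recursion.

    probs[t][k] is the probability of symbol k at frame t (each row sums to 1).
    labels is the target sequence without blanks.
    """
    # Interleave blanks: [a, b] -> [BLANK, a, BLANK, b, BLANK]
    ext = [BLANK]
    for label in labels:
        ext += [label, BLANK]
    size = len(ext)

    alpha = [0.0] * size
    alpha[0] = probs[0][BLANK]
    if size > 1:
        alpha[1] = probs[0][ext[1]]

    for t in range(1, len(probs)):
        new = [0.0] * size
        for s in range(size):
            total = alpha[s]                      # stay on the same symbol
            if s >= 1:
                total += alpha[s - 1]             # advance by one position
            if s >= 2 and ext[s] != BLANK and ext[s] != ext[s - 2]:
                total += alpha[s - 2]             # skip the blank between two different labels
            new[s] = total * probs[t][ext[s]]
        alpha = new

    # Valid paths end on the last label or on the trailing blank.
    likelihood = alpha[-1] + (alpha[-2] if size > 1 else 0.0)
    return -math.log(likelihood) if likelihood > 0 else math.inf


def brute_force_likelihood(probs, labels):
    """Sum the probability of every path that collapses to labels (tiny inputs only)."""
    num_symbols = len(probs[0])
    total = 0.0
    for path in product(range(num_symbols), repeat=len(probs)):
        if collapse(path) == list(labels):
            p = 1.0
            for t, symbol in enumerate(path):
                p *= probs[t][symbol]
            total += p
    return total
```

The recursion costs time proportional to the number of frames times the label length, whereas the enumeration grows exponentially with the number of frames, which is why the dynamic-programming form is the one used in practice.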

Detects individual words rather than whole text lines.

Papers

Building on recent advances in image caption generation and optical character recognition (OCR), we present a general-purpose, deep learning-based system to decompile an image into presentational markup. While this task is a well-studied problem in OCR, our method takes an inherently different, data-driven approach. Our model does not require any knowledge of the underlying markup language, and is simply trained end-to-end on real-world example data. The model employs a convolutional network for text and layout recognition in tandem with an attention-based neural machine translation system. To train and evaluate the model, we introduce a new dataset of real-world rendered mathematical expressions paired with LaTeX markup, as well as a synthetic dataset of web pages paired with HTML snippets. Experimental results show that the system is surprisingly effective at generating accurate markup for both datasets. While a standard domain-specific LaTeX OCR system achieves around 25% accuracy, our model reproduces the exact rendered image on 75% of examples. 

We present recursive recurrent neural networks with attention modeling (R2AM) for lexicon-free optical character recognition in natural scene images. The primary advantages of the proposed method are: (1) use of recursive convolutional neural networks (CNNs), which allow for parametrically efficient and effective image feature extraction; (2) an implicitly learned character-level language model, embodied in a recurrent neural network which avoids the need to use N-grams; and (3) the use of a soft-attention mechanism, allowing the model to selectively exploit image features in a coordinated way, and allowing for end-to-end training within a standard backpropagation framework. We validate our method with state-of-the-art performance on challenging benchmark datasets: Street View Text, IIIT5k, ICDAR and Synth90k.

Clustering is central to many data-driven application domains and has been studied extensively in terms of distance functions and grouping algorithms. Relatively little work has focused on learning representations for clustering. In this paper, we propose Deep Embedded Clustering (DEC), a method that simultaneously learns feature representations and cluster assignments using deep neural networks. DEC learns a mapping from the data space to a lower-dimensional feature space in which it iteratively optimizes a clustering objective. Our experimental evaluations on image and text corpora show significant improvement over state-of-the-art methods
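As a rough illustration of the clustering objective that DEC optimizes, the sketch below computes the soft cluster assignments (a Student's t kernel between embedded points and centroids), the sharpened target distribution, and the KL divergence between them. It covers only the objective: the encoder network and the gradient updates are omitted, and the function names and the default `alpha=1.0` are illustrative choices rather than anything taken from a reference implementation.

```python
import math


def soft_assign(points, centroids, alpha=1.0):
    """Soft assignment q[i][j] of embedded point i to centroid j (Student's t kernel)."""
    q = []
    for z in points:
        row = []
        for mu in centroids:
            dist_sq = sum((a - b) ** 2 for a, b in zip(z, mu))
            row.append((1.0 + dist_sq / alpha) ** (-(alpha + 1.0) / 2.0))
        norm = sum(row)
        q.append([v / norm for v in row])
    return q


def target_distribution(q):
    """Sharpened targets p[i][j], proportional to q[i][j]^2 / f[j], f[j] = sum_i q[i][j]."""
    num_clusters = len(q[0])
    freq = [sum(row[j] for row in q) for j in range(num_clusters)]
    p = []
    for row in q:
        weights = [row[j] ** 2 / freq[j] for j in range(num_clusters)]
        norm = sum(weights)
        p.append([w / norm for w in weights])
    return p


def kl_divergence(p, q):
    """KL(P || Q) summed over all points and clusters; the quantity being minimized."""
    return sum(
        p_ij * math.log(p_ij / q_ij)
        for p_row, q_row in zip(p, q)
        for p_ij, q_ij in zip(p_row, q_row)
        if p_ij > 0
    )
```

Squaring the assignments and renormalizing pushes each point toward the cluster it already favors, so minimizing the divergence gradually tightens the clusters in the embedded space.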

In recent years, recognition of text from natural scene images and video frames has received increased attention among researchers due to its various complexities and challenges. Because of low resolution, blurring effect, complex background, different fonts, color and variant alignment of text within images and video frames, etc., text recognition in such scenarios is difficult. Most of the current approaches usually apply a binarization algorithm to convert them into binary images and next OCR is applied to get the recognition result. In this paper, we present a novel approach based on color channel selection for text recognition from scene images and video frames. In the approach, at first, a color channel is automatically selected and then the selected color channel is considered for text recognition. Our text recognition framework is based on Hidden Markov Model (HMM) which uses Pyramidal Histogram of Oriented Gradient features extracted from the selected color channel. From each sliding window of a color channel our color-channel selection approach analyzes the image properties from the sliding window and then a multi-label Support Vector Machine (SVM) classifier is applied to select the color channel that will provide the best recognition results in the sliding window. This color channel selection for each sliding window has been found to be more fruitful than considering a single color channel for the whole word image. Five different features have been analyzed for multi-label SVM based color channel selection where wavelet transform based feature outperforms others. Our framework has been tested on different publicly available scene/video text image datasets. For Devanagari script, we collected our own dataset. The performances obtained from experimental results are encouraging and show the advantage of the proposed method.

Recently, scene text detection has become an active research topic in computer vision and document analysis, because of its great importance and significant challenge. However, vast majority of the existing methods detect text within local regions, typically through extracting character, word or line level candidates followed by candidate aggregation and false positive elimination, which potentially exclude the effect of wide-scope and long-range contextual cues in the scene. To take full advantage of the rich information available in the whole natural image, we propose to localize text in a holistic manner, by casting scene text detection as a semantic segmentation problem. The proposed algorithm directly runs on full images and produces global, pixel-wise prediction maps, in which detections are subsequently formed. To better make use of the properties of text, three types of information regarding text region, individual characters and their relationship are estimated, with a single Fully Convolutional Network (FCN) model. With such predictions of text properties, the proposed algorithm can simultaneously handle horizontal, multi-oriented and curved text in real-world natural images. The experiments on standard benchmarks, including ICDAR 2013, ICDAR 2015 and MSRA-TD500, demonstrate that the proposed algorithm substantially outperforms previous state-of-the-art approaches. Moreover, we report the first baseline result on the recently-released, large-scale dataset COCO-Text.
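The step of forming detections from a pixel-wise prediction map can be pictured with a deliberately simplified sketch: threshold the text-region scores, group the surviving pixels into connected components, and report one box per component. This is not the paper's procedure, which also uses the character and linking predictions and handles oriented and curved text; the function name, the 0.5 threshold, the 4-connectivity, and the axis-aligned boxes are all assumptions made for illustration.

```python
from collections import deque


def boxes_from_score_map(score_map, threshold=0.5):
    """Return (left, top, right, bottom) boxes for connected regions scoring >= threshold.

    score_map is a list of rows of per-pixel text scores in [0, 1].
    Coordinates are inclusive pixel indices.
    """
    height = len(score_map)
    width = len(score_map[0]) if height else 0
    seen = [[False] * width for _ in range(height)]
    boxes = []

    for y in range(height):
        for x in range(width):
            if seen[y][x] or score_map[y][x] < threshold:
                continue
            # Breadth-first search over the 4-connected text region starting here.
            queue = deque([(y, x)])
            seen[y][x] = True
            left, top, right, bottom = x, y, x, y
            while queue:
                cy, cx = queue.popleft()
                left, right = min(left, cx), max(right, cx)
                top, bottom = min(top, cy), max(bottom, cy)
                for dy, dx in ((1, 0), (-1, 0), (0, 1), (0, -1)):
                    ny, nx = cy + dy, cx + dx
                    if (0 <= ny < height and 0 <= nx < width
                            and not seen[ny][nx]
                            and score_map[ny][nx] >= threshold):
                        seen[ny][nx] = True
                        queue.append((ny, nx))
            boxes.append((left, top, right, bottom))
    return boxes
```

Because every pixel is scored from the full image, neighboring words that touch in the map merge into one component under this scheme, which is one reason a real system needs the additional character and relationship cues the abstract mentions.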

Blogs

The complete feature description process: http://dataunion.org/wp-content/uploads/2015/05/640.webp_2.jpg

Presentations

Projects

Commercial products

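Author: chenqin
Link: https://www.zhihu.com/question/19593313/answer/18795396
Source: Zhihu
Copyright belongs to the author. For commercial reprints, contact the author for authorization; for non-commercial reprints, cite the source.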

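1. Extremely high recognition rate. I have used every piece of software mentioned in the current answer summaries, but on a table like the one below, ABBYY alone kept a recognition rate above 95% (including the three characters 秦皇岛, Qinhuangdao), while all the others failed completely. Misreading digits is one thing, but they could not recognize the Chinese either. A lesson learned the hard way.
![](https://pic3.zhimg.com/a1b8009516c105556d2a2df319c72d72_b.jpg)
2. High flexibility. You can manually divide a single page into different blocks, and each block can be set separately to table or text, and to Simplified Chinese, Traditional Chinese, English, or digits. Most software at the time could apply only one recognition scheme per page: either table or text.
3. Convenient batch operation. For yearbooks with similar layouts, once the layout for one page is set up, it can be applied to the other pages, saving a great deal of repetitive work.
4. It preserves the original table formatting, so no re-editing is needed. When recognizing tables that span pages, choose "Recognize as EXCEL" and ABBYY joins the tables together, producing a single Excel file that is much easier to analyze.
5. It includes many image correction methods, such as keystone correction and skew correction. Even if the scan is crooked, or the text near the spine is distorted because the book is too thick, it can be corrected.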
Convert scanned images of documents into rich text with advanced Deep Learning OCR APIs. Free forever plans available.
  • IRIS
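Only a handful of companies do Chinese OCR at a truly professional level: two in China and two abroad. The Chinese ones are Wentong (文通) and Hanwang (汉王); the foreign ones are ABBYY and IRIS (Taiwan used to have two, 丹青 and 蒙恬, but they have been quiet in recent years). The products people have mentioned, such as 紫光OCR, CAJViewer, MS Office, 清华OCR, and 慧视小灵鼠, are all either Wentong products or built on Wentong's recognition engine, while 尚书 is a Hanwang product that is bundled with 中晶 scanners. Both companies achieve very good Chinese recognition rates. The two foreign companies stand out for their high recognition rates on Western languages, their support for many Western European languages, and their high degree of productization, but their speed and recognition rate on Chinese still lag behind, although they have been improving steadily in recent years. Google's open-source project, at least for Chinese, is still quite far behind these companies on every performance metric.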

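Author: 张岩 (Zhang Yan)
Link: https://www.zhihu.com/question/19593313/answer/14199596
Source: Zhihu
Copyright belongs to the author. For commercial reprints, contact the author for authorization; for non-commercial reprints, cite the source.
https://github.com/cisocrgroup
The best free API I have seen so far; a commercial version is also available.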

OCR Databases

OTHERS

Discussion and Feedback

Scan the QR code to join the discussion and share. If the code has expired, add my personal WeChat: edwin_whs

Stargazers over time
