• This repository has been archived on 19/Jun/2021
  • Stars: 27
  • Rank 905,827 (Top 18 %)
  • Language: C
  • Created about 8 years ago
  • Updated almost 7 years ago

Repository Details

pdf2xml converter based on the Xpdf library (modified version)

More Repositories

1. grobid (Java, 3,496 stars): A machine learning software for extracting information from scholarly documents
2. delft (Python, 388 stars): a Deep Learning Framework for Text https://delft.readthedocs.io/
3. grobid_client_python (Python, 274 stars): Python client for GROBID Web services
4. entity-fishing (Java, 239 stars): A machine learning tool for fishing entities
5. pdfalto (C, 214 stars): PDF to XML ALTO file converter
6. biblio-glutton (Java, 117 stars): A high performance bibliographic information service: https://biblio-glutton.readthedocs.io
7. article_dataset_builder (Python, 53 stars): Open Access PDF harvester, metadata aggregator and full-text ingester
8. grobid-ner (Java, 48 stars): A Named-Entity Recogniser based on Grobid.
9. Pub2TEI (XSLT, 43 stars): Service for converting and enhancing heterogeneous publisher XML formats into TEI
10. biblio_glutton_harvester (Python, 34 stars): Open Access PDF harvester
11. grobid-example (Java, 18 stars): Some examples of usage of Grobid in a third party java project.
12. grisp (Java, 16 stars): Knowledge Base stuff
13. grobid-client-node (JavaScript, 14 stars): Simple node.js client for GROBID REST services
14. xpdf-4.00 (C++, 13 stars)
15. datastet (JavaScript, 13 stars): Finding mentions and citations to named and implicit research datasets from within the academic literature
16. biblio-glutton-extension (JavaScript, 11 stars): A browser extension providing Open Access bibliographical services
17. grobid-astro (JavaScript, 11 stars): A machine learning software for extracting astronomical entities from scholarly documents
18. kish (JavaScript, 7 stars): Keeping It Simple is Hard
19. arxiv_harvester (Python, 6 stars): Poor man's simple harvester for arXiv resources
20. grobid-client-java (Java, 5 stars): Simple Java client for GROBID REST services
21. xpdf-4.03 (C++, 2 stars): patched xpdf lib for pdfalto
22. anHALytics (JavaScript, 2 stars): Analytic platform for the HAL research archive
23. grobid-bio (Java, 2 stars): Basic grobid-based bio-entity tagger using BioNLP/NLPBA 2004 dataset
24. dataset_recognition_resources (Python, 2 stars)
25. softcite-api (Python, 2 stars): Web API for the Softcite Knowledge-Base
26. softdata_mentions_client (Python, 2 stars): Python client for software and dataset mention recognizer in scholarly publications, using the Softcite and Datastet services
27. xpdf-3.04 (C++, 1 star)