  • Stars
    6
  • Rank 2,468,220 (Top 50 %)
  • Language
    Python
  • License
    Apache License 2.0
  • Created over 2 years ago
  • Updated over 1 year ago

Repository Details

Poor man's simple harvester for arXiv resources

More Repositories

1. grobid (Java, 3,109 stars): A machine learning software for extracting information from scholarly documents
2. delft (Python, 384 stars): a Deep Learning Framework for Text
3. grobid_client_python (Python, 248 stars): Python client for GROBID Web services
4. entity-fishing (Java, 234 stars): A machine learning tool for fishing entities
5. pdfalto (C, 195 stars): PDF to XML ALTO file converter
6. biblio-glutton (Java, 112 stars): A high performance bibliographic information service
7. article_dataset_builder (Python, 49 stars): Open Access PDF harvester, metadata aggregator and full-text ingester
8. grobid-ner (Java, 48 stars): A Named-Entity Recogniser based on Grobid.
9. Pub2TEI (XSLT, 37 stars): Service for converting and enhancing heterogeneous publisher XML formats into TEI
10. biblio_glutton_harvester (Python, 33 stars): Open Access PDF harvester
11. pdf2xml (C, 27 stars): pdf2xml convertor based on Xpdf library - modified version
12. grobid-example (Java, 17 stars): Some examples of usage of Grobid in a third party java project.
13. grisp (Java, 16 stars): Knowledge Base stuff
14. grobid-client-node (JavaScript, 14 stars): Simple node.js client for GROBID REST services
15. xpdf-4.00 (C++, 13 stars)
16. datastet (JavaScript, 13 stars): Finding mentions and citations to named and implicit research datasets from within the academic literature
17. biblio-glutton-extension (JavaScript, 11 stars): A browser extension providing Open Access bibliographical services
18. grobid-astro (JavaScript, 9 stars): A machine learning software for extracting astronomical entities from scholarly documents
19. kish (JavaScript, 7 stars): Keeping It Simple is Hard
20. grobid-client-java (Java, 5 stars): Simple Java client for GROBID REST services
21. xpdf-4.03 (C++, 2 stars): patched xpdf lib for pdfalto
22. anHALytics (JavaScript, 2 stars): Analytic platform for the HAL research archive
23. grobid-bio (Java, 2 stars): Basic grobid-based bio-entity tagger using BioNLP/NLPBA 2004 dataset
24. dataset_recognition_resources (Python, 2 stars)
25. softcite-api (Python, 2 stars): Web API for the Softcite Knowledge-Base
26. softdata_mentions_client (Python, 2 stars): Python client for software and dataset mention recognizer in scholarly publications, using the Softcite and Datastet services
27. xpdf-3.04 (C++, 1 star)