National Library of the Netherlands / Research (@KBNLresearch)

Top repositories

1

ochre

Toolbox for OCR post-correction
Common Workflow Language
122
star
2

europeananp-ner

Named Entities Recognition Annotator Tool for Europeana Newspapers
Java
60
star
3

isolyzer

Verify size of ISO 9660 image against Volume Descriptor fields
Python
43
star
4

keyword-generator

Command-line tool to extract a ranked list of relevant keywords from a corpus with the option of using either topic modeling or tf-idf scores.
Python
41
star
5

iromlab

Loader software for automated imaging of optical media with Nimbie disc robot
HTML
31
star
6

tapeimgr

Simple tape imaging and extraction tool
Python
27
star
7

KB-python-API

Python API for KB data-services
Python
18
star
8

europeananp-dbpedia-disambiguation

Python
17
star
9

forensicImagingResources

Shell
16
star
10

alto-editor

Browser-based post-correction tool for ALTO XML files
JavaScript
12
star
11

dac

Entity linker for the newspaper collection of the National Library of the Netherlands. Links named entity mentions to DBpedia descriptions using either a binary SVM classifier or a neural net.
Python
12
star
12

frame-generator

Tool for extracting topics, keywords and their collocates from a Dutch corpus. Includes and extends the functionality of the Keyword Generator.
Python
9
star
13

diskimgr

Simple workflow tool for imaging block devices
Python
9
star
14

dac-web

Web interface to manually annotate named entity mentions in newspaper articles with the correct DBpedia link(s), if any. Produces labeled data sets for training and evaluating the DAC Entity Linker.
Python
9
star
15

scansion-generator

Command-line tool that generates a scansion for modern Dutch metric poetry.
Python
8
star
16

genre-classifier

Genre classifier for Dutch historical newspaper articles.
Python
7
star
17

omimgr

Simple workflow tool for imaging optical media
Python
7
star
18

omSipCreator

Create ingest-ready SIPs from batches of optical media images
Python
7
star
19

multiNER

Multiple NER tools combined in one output. Invokes multiple NER engines in parallel.
Python
6
star
20

siamese

Advertisement search interface based on image similarity.
Python
6
star
21

openjpeg-decoder-service

A Java-based JP2 decoder service.
Java
5
star
22

jp2totiff

Shell
5
star
23

textExtractDemo

Text extraction demo
Python
5
star
24

xml-workshop

Automatically extract text, layout and metadata information from XML-files of OCR-ed historical texts
Jupyter Notebook
4
star
25

ocropus-wrapper

Simple Python wrapper for ocropus command line invocation
Python
4
star
26

Ebook-Fixer

JavaScript
3
star
27

DBNL-canonicity

KB RiR project to collect a corpus of Dutch novels (1800-2000) and investigate canonicity
Python
3
star
28

enhance_ocr

Enhance OCR of newspapers archive
Python
3
star
29

ipmlab

Image Portable Media Like A Boss
Python
3
star
30

chatbot-builder-nl

JavaScript
3
star
31

ebooks-qa

Scripts for quality assessment of e-books
Python
3
star
32

jp2view

Experimental Java JP2 viewer using JNI bindings with OpenJPEG 2.0
Java
3
star
33

iromsgl

Single-disc version of Iromlab
HTML
2
star
34

dictionary-viewer

View the number of newspaper articles per year containing a user-specified minimum number of keywords.
JavaScript
2
star
35

detectDamagedAudio

Tests on how to detect damaged WAV files
HTML
2
star
36

spatio-temporal-topics

Python
2
star
37

genre-classifier-gui

Web interface for the genre classifier.
HTML
2
star
38

CHRONIC

Classified Historical Newspaper Images
HTML
2
star
39

frame-generator-gui

Web interface for the Frame Generator.
JavaScript
2
star
40

xs4all-resources

Scripts and documentation related to the xs4all homepage rescue efforts
Python
2
star
41

cdtestcorpus

Scripts and data for creating test CDs using different CD layouts
HTML
1
star
42

tikadetect-tree

Bash script that performs file format identification on all files in a directory tree using Apache Tika
Shell
1
star
43

gado2

Dutch/Indonesian BERT-NER setup.
C++
1
star
44

topics

Predict news article topics and DBpedia description topics and type.
Jupyter Notebook
1
star
45

summerSchoolPDFEpub

Background information and in-depth materials for the PDF and EPUB component of the KB Summer School
1
star
46

ProtoCST

A prototype web application for corpus selection, inspection and export
HTML
1
star
47

mvds

City monitor (Monitor van de stad)
Python
1
star
48

IwI22_ARTIST

Jupyter Notebooks and other materials created during ICT With Industry 2022
Jupyter Notebook
1
star
49

bb_recog

Book back recognition
1
star
50

magic-file-java-6

Experimental Java binding for libmagic file characterisation
Java
1
star
51

zenodoReports

Fetch metadata and generate reports for a Zenodo community.
Python
1
star
52

hack4europe

JavaScript-based portal for searching Europeana collections and creating enrichments on the metadata.
JavaScript
1
star
53

Annif_data_exp

Automatic subject assignment for KB ebooks using Annif.
Jupyter Notebook
1
star
54

Hackalod

GitHub repo of the Koninklijke Bibliotheek (KB), created for Hackalod 2021 (https://hackalod.com/)
PHP
1
star
55

dbpedia-indexer

Collection of Python scripts to build a Solr index from selected Dutch and English DBpedia dumps.
Python
1
star
56

intro-kb-apis

Materials for the RUG workshop on the KB search and harvest APIs.
1
star
57

EntangledHistories

Processing of Transkribus output using XSLT and running it through Annif
Jupyter Notebook
1
star
58

Demosaurus

Demo web application that supports author attribution (thesaureren) and topic attribution (subject indexing). Annif is used for the latter.
Jupyter Notebook
1
star