pynlpl
PyNLPl, pronounced as 'pineapple', is a Python library for Natural Language Processing. It contains various modules useful for common, and less common, NLP tasks. PyNLPl can be used for basic tasks such as the extraction of n-grams and frequency lists, and to build simple language models. There are also more complex data types and algorithms. Moreover, there are parsers for file formats common in NLP (e.g. FoLiA/Giza/Moses/ARPA/Timbl/CQL). There are also clients to interface with various NLP-specific servers. PyNLPl most notably features a very extensive library for working with FoLiA XML (Format for Linguistic Annotation).
vocage
A minimalistic spaced-repetition vocabulary trainer (flashcards) for the terminal
clam
Quickly turn command-line applications into RESTful webservices with a web-application front-end. You provide a specification of your command line application, its input, output and parameters, and CLAM wraps around your application to form a fully fledged RESTful webservice.
colibri-core
Colibri Core is an NLP tool as well as a C++ and Python library for working with basic linguistic constructions such as n-grams and skipgrams (i.e. patterns with one or more gaps, either of fixed or dynamic size) in a quick and memory-efficient way. At the core is the tool ``colibri-patternmodeller``, which allows you to build, view, manipulate and query pattern models.
flat
FoLiA Linguistic Annotation Tool -- Flat is a web-based linguistic annotation environment based around the FoLiA format (http://proycon.github.io/folia), a rich XML-based format for linguistic annotation. Flat allows users to view annotated FoLiA documents and enrich these documents with new annotations; a wide variety of linguistic annotation types is supported through the FoLiA paradigm.
LaMachine
LaMachine - A software distribution of our in-house as well as some 3rd party NLP software - Virtual Machine, Docker, or local compilation/installation script
folia
FoLiA: Format for Linguistic Annotation - FoLiA is a rich XML-based annotation format for the representation of language resources (including corpora) with linguistic annotations. A wide variety of linguistic annotations are supported, making FoLiA a useful format for NLP tasks and data interchange. Note that the actual Python library for processing FoLiA is implemented as part of PyNLPl; this contains higher-level tools that use the library as well as the full documentation, validation schemas, and set definitions
python-frog
Python bindings to the Dutch NLP tool Frog (PoS tagger, lemmatiser, NER tagger, morphological analysis, shallow parser, dependency parser)
analiticcl
An approximate string matching or fuzzy-matching system for spelling correction, normalisation or post-OCR correction
python-ucto
This is a Python binding to the tokeniser Ucto. Tokenisation is one of the first steps in almost any Natural Language Processing task, yet it is not always as trivial a task as it appears to be. This binding makes the power of the Ucto tokeniser available to Python. Ucto itself is a regular-expression-based, extensible, and advanced tokeniser written in C++ (http://ilk.uvt.nl/ucto).
codemetapy
A Python package for generating and working with codemeta
gecco
Generic Environment for Context-Aware Correction of Orthography
homeassistant-config
My elaborate home automation configuration + scripts
dotfiles
My dotfiles
deepfrog
An NLP-suite powered by deep learning
hanzigrid
Hanzi grids for studying Mandarin Chinese (tool & output data)
foliapy
An extensive Python library for dealing with FoLiA (Format for Linguistic Annotation) documents, a rich XML-based format for linguistic annotation finding application in Natural Language Processing (NLP). This library was formerly part of PyNLPl.
procmapgen
A small toy project written in Rust: procedural generation of various kinds of grid-based maps.
python-timbl
python-timbl, originally developed by Sander Canisius, is a Python extension module wrapping the full TiMBL C++ programming interface. With this module, all functionality exposed through the C++ interface is also available to Python scripts. Being able to access the API from Python greatly facilitates prototyping TiMBL-based applications.
spacy2folia
Use spaCy for NLP and output to the FoLiA XML format.
foliatools
A number of command-line tools for working with FoLiA (Format for Linguistic Annotation). Includes validators, converters, visualisers, and more.
pbmbmt
Phrase-based Memory-based Machine Translation
unilangforum
UniLang Language Community - Forum
colibri
This project is being rendered obsolete by its newer versions, colibri-core and colibri-mt.
valkuil-gecco
Dutch spelling correction system (Nederlandse Spellingscontrole), powered by Gecco
nederlab-pipeline
Linguistic enrichment pipeline for historical Dutch, as used in the Nederlab project
anavec
Proof-of-concept spelling correction/normalisation system based on anagram vectors
codemeta-harvester
Harvest and aggregate codemeta/schema.org software metadata from source repositories and service endpoints, automatically converting from known metadata schemes in the process
foliadocserve
FoLiA Document Server - HTTP webservice backend for serving and annotating FoLiA documents using the FoLiA Query Language (FQL). Used by FLAT.
piereling
Piereling is a webservice and web-application to convert between a variety of document formats, mostly from and to FoLiA XML. It is intended for NLP pipelines.
lingua-cli
Very small, simple command-line interface for language detection using lingua-rs
colibri-mt
A Machine Translation framework that wraps around the Moses Decoder and enables k-NN classifier techniques to be used for modelling source-side context
babelente
BabelEnte: Entity Extractor and Translator using BabelFy and Babelnet.org
labirinto
A web front-end portal for a virtual laboratory of NLP tools
clamservices
A collection of CLAM webservices for several of our Natural Language Processing tools
folia-rust
FoLiA library for Rust (alpha)
codemeta-server
Server for codemeta, in-memory triple store, SPARQL endpoint and simple web-based visualisation for end users
sesdiff
Generates a shortest edit script (Myers' diff algorithm) to indicate how to get from the strings in column A to the strings in column B. Also provides the edit distance (Levenshtein).
alpino_clam_webservice
A CLAM-powered webservice for Alpino, a dependency parser for Dutch
vocadata
Data for vocabulary learning
parseme-support
FoLiA & FLAT support for PARSEME
spreek2schrijf
Scripts for Spreek2Schrijf, a project with the Tweede Kamer (the Dutch House of Representatives)
svkbd
My fork of suckless' simple virtual keyboard: https://tools.suckless.org/x/svkbd/
sxmo-docs
My fork of https://git.sr.ht/~mil/sxmo-docs
aNtiLoPe
A collection of NLP pipelines powered by Nextflow
sxmo-utils
My fork of https://git.sr.ht/~mil/sxmo-utils/
wrexp
Experiment Wrapper - A framework for launching and keeping track of experiments. Wrexp takes care of storing all stdout/stderr logs and mails you when experiments are completed.
wikiente
A named entity recogniser and linker based on DBpedia Spotlight, with support for the FoLiA format
colibri-apps
Contains NLP applications using Colibri Core, suited for end-users. The applications are generally web-based.
wsd2
colloquery
Web application for searching for phrases/collocations/synonyms in phrase translation tables
lexmatch
Simple lexicon matcher against a text
colibri-utils
NLP utilities that rely on Colibri Core: currently only language identification
nlpsandbox
Natural Language Processing Sandbox - An experimental playground for all kinds of NLP tasks
ssam
Split sampler: split your data into multiple sets (e.g. train/test/development)
LaMachine-docker-test
Meta repository for docker testing of LaMachine on Travis-CI
dwm
My patched fork of dwm
unilang_ulr
Collection of open language resources from UniLang; containing mostly phrasebooks and stories
oersetter-models
Models for Oersetter, a Frisian<->Dutch Machine Translation system
chira
Chinese Reading Assistant, pop-up translations for Linux
sxmo-svkbd
My fork of https://git.sr.ht/~mil/sxmo-svkbd
valkuil
Valkuil.net is an automatic spelling corrector for Dutch that detects ordinary typos as well as grammatical errors and confusions between existing words.
aur-packages
Arch User Repository packages I maintain
cwrap
Small C wrapper to turn a C function into a very simple webservice
campyon
Campyon is both a command-line tool and a Python library for viewing and manipulating columned data files. It supports various filters, statistics, visualisations, and plotting.
vocavue
A vocabulary trainer with a view
lst-chat
homepage
My website
hyphertool
Command-line tool for syllabification and hyphenation for multiple languages
lamastats
Generates statistical reports on the usage of our software and webservices
charfreq
Very simple command-line tool that counts (Unicode) character frequency from standard input
colibrita
Colibrita is a proof-of-concept translation assistance system, translating L1 fragments in an L2 context, using machine learning and statistical machine translation techniques