• Stars
    8
  • Rank 2,041,642 (Top 42 %)
  • Language
  • License
    Other
  • Created over 4 years ago
  • Updated about 4 years ago

Repository Details

Gold standard resource for evaluation of Danish word embedding models.

More Repositories

1. cstlemma
   Lemmatiser for Danish, Dutch, English, German, Polish, Romanian, Russian and tens of other languages, that uses affix rules (affix: prefix, infix, suffix, circumfix). Rules are obtained by supervised learning from a full form - lemma list.
   C++, 32 stars

2. stucco
   An experimental adaptive UI toolkit.
   Clojure, 31 stars

3. xml-hiccup
   Convert XML into Hiccup in Clojure and ClojureScript.
   Clojure, 19 stars

4. DanNet
   The Danish WordNet as an RDF graph.
   Clojure, 18 stars

5. taggerXML
   Modernized version of Eric Brill's Part Of Speech tagger.
   C++, 17 stars

6. tf-idf
   A reasonably performant TF-IDF implementation.
   Clojure, 12 stars

7. rescope
   Turn documents into UI components.
   Clojure, 7 stars

8. pedestal-sp
   Turn a Pedestal web service into a SAML Service Provider.
   Clojure, 7 stars

9. rtfreader
   Text segmenter and tokeniser for Danish, English and other languages. Reads an RTF or flat text file and outputs the text, one line per sentence & optionally tokenized.
   C++, 6 stars

10. texton
    Text Tonsorium - a toolbox that automatically arranges NLP tools in workflows and enacts them with user's inputs.
    PHP, 5 stars

11. Anvil-Facetracker
    OpenCV-based Plugin for the Anvil annotation software that tracks faces and creates annotations when velocity or acceleration thresholds are transgressed.
    Java, 5 stars

12. cuphic
    Transform or scrape Hiccup with a declarative DSL.
    Clojure, 4 stars

13. glossematics
    The life of Louis Hjelmslev.
    Clojure, 4 stars

14. affixtrain
    Using supervised learning, create a set of affix rules for use by the CSTlemma lemmatiser.
    C++, 4 stars

15. letterfunc
    Functions for upper/lower casing, for testing whether a character is a letter and for conversion between Unicode encodings UTF-8 and UTF-16.
    C, 2 stars

16. texton-Java
    Web-based workflow management system that computes candidate tool workflows given input file(s) and the user's requirements regarding the output. Afterwards, runs a workflow selected by the user from the list of candidates. Implemented in Bracmat (~75%) and Java (~25%).
    Java, 2 stars

17. danish-semantic-reasoning-benchmark
    A Danish semantic reasoning benchmark compiled from lexical semantic resources.
    1 star

18. qname
    A QName record and conversions between QNames, Keywords, and IRI strings.
    Clojure, 1 star

19. texton-linguistic-resources
    Linguistic resources for several of the tools included in the Text Tonsorium.
    Roff, 1 star

20. head_movement_detection
    Jupyter notebooks and training data containing manual head movement annotations, speech data and velocity, acceleration and jerk data.
    Jupyter Notebook, 1 star