  • Stars
    5
  • Rank 2,861,937 (Top 57 %)
  • Language
    C++
  • License
    GNU General Publi...
  • Created almost 10 years ago
  • Updated almost 10 years ago

Repository Details

Efficient substring searches on text corpora using a compressed index

More Repositories

1

vim-256noir

A dark 256-color colorscheme for vim
Vim Script
161
star
2

roaringbitmap

Roaring Bitmap in Cython
Cython
79
star
3

disco-dop

Discontinuous Data-Oriented Parsing
Python
46
star
4

dutchcoref

Dutch coreference resolution & dialogue analysis using deterministic rules
Python
21
star
5

seekaywhy

A probabilistic CKY parser for PCFGs
Python
19
star
6

eodop

Data-Oriented Parsing implementation for NLTK applied to Esperanto morphology and syntax
TeX
10
star
7

pdfbrowse

A simple AJAX PDF viewer and browser
Python
8
star
8

subsequences

Extract longest common subsequences from texts
Python
7
star
9

codingforhumanities

Coding for Humanities course materials
Jupyter Notebook
6
star
10

activedop

A treebank annotation tool based on a statistical parser that is re-trained during annotation
Python
4
star
11

litvecspace

Accompanying code for the paper "Vector space explorations of literary language"
Jupyter Notebook
4
star
12

dop-transformations

Transformations with Data-Oriented Parsing
Python
3
star
13

tgrep2

Fork of tgrep2
C
3
star
14

literariness

Code for the paper "A data-oriented model of literary language"
Python
2
star
15

udstyle

Compute complexity metrics from Universal Dependencies
Python
2
star
16

authident

Authorship attribution with syntactic fragments
Python
2
star
17

ethnlpgender

Code for paper "Bias and Fairness in Authorial Gender Attribution"
Jupyter Notebook
2
star
18

kinglit

Code for the paper "Stylometric Literariness Classification: the Case of Stephen King"
Jupyter Notebook
2
star
19

qrinductionsplit

Splitting qualitative process models as produced by model induction using a behavior graph
TeX
2
star
20

diction

diction / style UNIX utilities
C
2
star
21

litcliches

Code for the paper "Cliche expressions in literary and genre novels"
Jupyter Notebook
1
star
22

litquest

Code and data for LaTeCH 2020 paper on a literariness questionnaire
Jupyter Notebook
1
star
23

dutchlitpreproc

Preprocessing pipeline for Dutch literature
Python
1
star
24

openboek

The OpenBoek corpus
1
star
25

fictiongenres

Code and data for the chapter "Computational Methods for the Analysis of Fiction Genres"
Jupyter Notebook
1
star