• Stars
    63
  • Rank 484,938 (Top 10%)
  • Language
    Python
  • License
    MIT License
  • Created almost 3 years ago
  • Updated 2 months ago

Repository Details

CLIR version of ColBERT

More Repositories

1

golden-horse

Named Entity Recognition for Chinese social media (Weibo). From EMNLP 2015 paper.
Python
534
star
2

turkle

Django-based clone of Amazon's Mechanical Turk service running in your local environment.
Python
142
star
3

PredPatt

PredPatt: Predicate-Argument Extraction from Universal Dependencies
Python
112
star
4

mingpipe

A Chinese name matcher written in Python. Described in: Nanyun Peng, Mo Yu, Mark Dredze. An Empirical Study of Chinese Name Matching and Applications. Association for Computational Linguistics (ACL) (short paper), 2015.
Python
37
star
5

EventMiner

Event extraction pipeline.
Python
35
star
6

concrete-python

Python modules and scripts for working with Concrete, a data serialization format for NLP
Python
20
star
7

patapsco

Cross language information retrieval pipeline
Python
18
star
8

concrete

Thrift definitions, making HLT data specifications concrete
Thrift
16
star
9

clir-tutorial

SIGIR 2023 tutorial on cross language information retrieval.
Jupyter Notebook
13
star
10

gazetteer-collection

Jupyter Notebook
12
star
11

xvectors

Python
7
star
12

HC4

HLTCOE CLIR Common-Crawl Collection
Python
7
star
13

parma

A Predicate Argument Linker
Scala
7
star
14

parma2

A predicate argument alignment tool
Scala
7
star
15

sandle

Run a large language modeling SANDbox in your Local Environment
Python
7
star
16

quicklime

Visualization tool for Concrete, a data serialization format for NLP
JavaScript
7
star
17

concrete-java

Java library for Concrete, a data serialization format for NLP
Java
6
star
18

concrete-deprecated

OLD project for Concrete-thrift
Java
5
star
19

prototurk

Simple server for rapidly prototyping Mechanical Turk interfaces
Python
5
star
20

cadet

CADET is a system for rapid discovery, annotation, and extraction on text
JavaScript
4
star
21

concrete-js

JavaScript library for working with Concrete, a data serialization format for NLP
JavaScript
3
star
22

docker-nltk

A very simple example pipeline for named entity recognition using off-the-shelf NLTK.
Python
3
star
23

vivisect

A framework for exploring the internals of DNN models
Python
3
star
24

vaporengine

VaporEngine
JavaScript
3
star
25

concrete-stanford

Concrete-Stanford: Wraps Stanford NLP with utilities to fit it into a concrete compliant workflow
Java
3
star
26

concrete-gigaword

Tools for mapping English Gigaword v5 to Concrete
Java
2
star
27

tift

Tift is for tokenization
Java
2
star
28

peer_measure

Implementation of the measure Probability of Equal Expected Rank
Python
2
star
29

tasa

TASA - Translation And Structural Alignment
JavaScript
2
star
30

fetch-wikiqa-corpus

Concrete FetchCommunicationService bundled with "WikiQA corpus"
1
star
31

probe

Scala
1
star
32

stretcher

Concrete file server
Java
1
star
33

concrete-stanford-deprecated2

Concrete-Stanford: Wraps Stanford NLP with utilities to fit it into a concrete compliant workflow
Java
1
star
34

annotated-nyt

Java wrappers and utilities for reading the Annotated NYT corpus
Java
1
star
35

lid

Python
1
star
36

simple-search-demo

JavaScript
1
star
37

concrete-ontology

Concrete ontology
Java
1
star
38

concrete-agiga

Tools to map between concrete and agiga representations
Java
1
star
39

styleguides

HLTCOE recommended style guidelines for importing into IDEs
1
star
40

BLADE

Python
1
star
41

rebar

Java
1
star
42

cmn-renmin-ocr-ner-dataset

NER annotations of the Chinese Newspaper Renmin
Python
1
star
43

goncrete

golang bindings for concrete
Go
1
star
44

cadet-search-lucene

A search implementation for Concrete
Java
1
star