• Stars
    3
  • Rank 3,943,079 (Top 79 %)
  • Language
    Python
  • License
    Apache License 2.0
  • Created almost 2 years ago
  • Updated about 1 month ago

Repository Details

Create datasets from WordPress sites for research or archiving

More Repositories

1. ultimate-sitemap-parser (Python, 178 stars)
   Ultimate Website Sitemap Parser
2. gate-core (Java, 75 stars)
   The GATE Embedded core API and GATE Developer application
3. broad_twitter_corpus (Jupyter Notebook, 64 stars)
   The Broad Twitter Corpus, an NER dataset in English stratified for time, location, social media genre, socioeconomic factors (COLING 2016)
4. python-gatenlp (Jupyter Notebook, 61 stars)
   Python text processing, pattern matching, and NLP framework
5. gateplugin-LearningFramework (Java, 26 stars)
   A plugin for the GATE language technology framework for training and using machine learning models. Currently supports Mallet (MaxEnt, NaiveBayes, CRF and others), LibSVM, Scikit-Learn, Weka, and DNNs through PyTorch and Keras.
6. semeval2019-hyperpartisan-bertha-von-suttner (Python, 22 stars)
   SemEval 2019 Hyperpartisan News Detection - team Bertha von Suttner contribution
7. gateplugin-Python (Java, 20 stars)
   Python integration for the GATE framework
8. Bio-YODIE (Java, 17 stars)
   Bio-YODIE is GATE's biomedical named entity linking pipeline.
9. mimir (Java, 10 stars)
   Multi-paradigm Information Management Index and Repository
10. cluster-embeddings (Python, 10 stars)
    Simple script to create clusters from embeddings in word2vec format
11. CANTM (Python, 8 stars)
12. jaspell (Java, 6 stars)
    Fork of http://jaspell.sourceforge.net to allow control over the character encoding used for the dictionary files.
13. gateplugin-Stanford_CoreNLP (Java, 5 stars)
    GATE wrappers for the Stanford CoreNLP tool set
14. StanceClassifier (Python, 5 stars)
    Stance Classifier for the WeVerify project
15. gate-teamware (Python, 4 stars)
    A web application for collaborative document annotation.
16. gate-lf-python-data (Python, 4 stars)
    Python library for handling (dense) training/application data produced by the Learning Framework
17. gateapplication-French (Shell, 3 stars)
    Processing pipeline for French, performing Tokenisation, POS Tagging and NER
18. emina (Python, 3 stars)
    Emergent Informativeness and Actionability
19. gcp (Java, 3 stars)
    GATE Cloud Paralleliser
20. gateapplication-German (Shell, 3 stars)
    Processing pipeline for German, performing Tokenisation, POS Tagging and NER
21. gate-cloud-python-example (Python, 3 stars)
    Example of using the GATE Cloud online API
22. gateplugin-dict-lemmatizer (Java, 3 stars)
    A plugin for the GATE language technology framework for finding lemmata of words.
23. gateplugin-Tagger_SyntaxNet (Java, 2 stars)
    A GATE plugin for using a Google TensorFlow Serving SyntaxNet server
24. gateplugin-JdbcLookup (Java, 2 stars)
    A plugin for the GATE language technology framework for adding and updating annotations from a JDBC table.
25. Tweet-Network-GEXF-Generator (Groovy, 2 stars)
    Tweet Network GEXF Generator
26. gateplugin-Lang_German (HTML, 2 stars)
    German language support for GATE
27. corpusconversion-bnc (Java, 2 stars)
    Tool to convert the British National Corpus to GATE format
28. dont-waste-single-annotation (2 stars)
29. gateplugin-Lang_Chinese (Java, 2 stars)
    Support for processing Chinese documents
30. gateplugin-MetaMapLite (Java, 2 stars)
    A GATE plugin wrapping MetaMapLite.
31. VaxxHesitancy (2 stars)
32. gateplugin-Tools (Java, 2 stars)
    A selection of processing resources commonly used to extend ANNIE
33. bio-yodie-resource-prep (Scala, 2 stars)
    Scripts to prepare the informational resources required by GATE Bio-YODIE.
34. gateplugin-Tagger_GoogleNLP (Java, 2 stars)
    GATE NLP plugin for the Google NLP
35. gateplugin-ModularPipelines (Java, 2 stars)
    A plugin for the GATE language technology framework that helps create modular pipelines and parametrize them
36. SurveyKeywordsExtraction (Python, 2 stars)
    Keywords extraction from survey questions
37. gatelib-spring (Java, 2 stars)
    Spring support for use with GATE
38. gateplugin-JAPE_Plus (Java, 2 stars)
    An alternative, usually more efficient and faster, JAPE implementation
39. gateplugin-Gazetteer_Ontology_Based (Java, 2 stars)
    An ontology based gazetteer for GATE
40. tweet-rehydrater (Java, 2 stars)
    Tool to take standoff annotations against a list of Tweets and merge them with the original text from Twitter
41. CLEF2024_InCrediblAE_Manual_Evaluation_Dataset (2 stars)
    Manual evaluation dataset of CheckThat! Lab at CLEF 2024 Task 6: Robustness of Credibility Assessment with Adversarial Examples (InCrediblAE)
42. gateplugin-Alignment (Java, 1 star)
43. gateplugin-LIWC (Java, 1 star)
    A GATE plugin to extract LIWC features
44. gateplugin-Crowd_Sourcing (Java, 1 star)
    GATE plugin to interface with the CrowdFlower crowd sourcing platform
45. gate-dsl (Groovy, 1 star)
    Write GATE applications in a Groovy DSL.
46. gateplugin-Lang_Danish (Java, 1 star)
    Support for processing Danish documents
47. userguide (TeX, 1 star)
    The GATE user guide
48. gate-lf-keras-json (Python, 1 star)
    Keras wrapper for the LearningFramework GATE plugin
49. gateplugin-Format_Twitter (Java, 1 star)
    Document Format plugin to support reading and writing Twitter style JSON files
50. gateplugin-ANNIE (Java, 1 star)
51. gateplugin-Twitter (Java, 1 star)
    A suite of tools designed for processing Tweets
52. gateplugin-Ontology_Tools (Java, 1 star)
53. gateplugin-Sentiment (Groovy, 1 star)
    Provides resources for Sentiment Analysis in GATE
54. gateplugin-Ontology (Java, 1 star)
    Ontology support for GATE
55. youtube-scraper (Python, 1 star)
    Scrape YouTube data
56. UNGA-search (CSS, 1 star)
    Exploration webapp for the UN GA Mímir index.
57. cloud-client (Java, 1 star)
    Client library for the GATE Cloud REST APIs
58. cluster-brown4wikipedia (Python, 1 star)
    Tools to simplify creating Brown clusters from Wikipedia dump files
59. gateplugin-Groovy (Java, 1 star)
    Adds support for the Groovy scripting language to GATE as well as making GATE easier to use from Groovy scripts
60. gate-lf-pytorch-json (Python, 1 star)
    PyTorch wrapper for the LearningFramework GATE plugin
61. gateplugin-Java (Java, 1 star)
    A plugin for the GATE language technology framework that allows on-the-fly use of Java programs as Processing Resources
62. gateplugin-UNGA (Python, 1 star)
    Information extraction for United Nations General Assembly Resolutions
63. corpusconversion-conll2003 (Scala, 1 star)
    Tools/scripts to help convert the CoNLL 2003 corpora to GATE format
64. gateplugin-Tagger_TagMe (Java, 1 star)
    GATE NLP plugin for the TagMe service
65. gateplugin-DocumentNormalizer (Java, 1 star)
    Tools for normalizing documents before processing
66. sklearn-wrapper (Python, 1 star)
    A lightweight wrapper around scikit-learn for the GATE LearningFramework plugin
67. weka-wrapper (Java, 1 star)
    A very lightweight wrapper around Weka
68. gateplugin-Format_DataSift (Java, 1 star)
    Document Format plugin to support reading DataSift JSON files
69. gateplugin-StringAnnotation (Java, 1 star)
    A plugin for the GATE language technology framework that provides gazetteer and regular expression annotator PRs for string annotation
70. gateplugin-CISTEM (Java, 1 star)
    A GATE wrapper around the CISTEM German Stemmer (see https://github.com/LeonieWeissweiler/CISTEM)
71. gate-lf-keras-sparse (Python, 1 star)
    A lightweight wrapper around Keras mainly for use with the GATE LearningFramework plugin