David S. Batista (@davidsbatista)

Top repositories

1

Annotated-Semantic-Relationships-Datasets

A collection of public and free annotated datasets of relationships between entities/nominals (Portuguese and English)
683
star
2

NER-datasets

Datasets to train supervised classifiers for Named-Entity Recognition in different languages (Portuguese, German, Dutch, French, English)
Python
337
star
3

NER-Evaluation

An implementation of full named-entity evaluation metrics based on SemEval'13 Task 9, computed not at the tag/token level but over all the tokens that are part of the named entity
Python
213
star
4

Snowball

Implementation with some extensions of the paper "Snowball: Extracting Relations from Large Plain-Text Collections" (Agichtein and Gravano, 2000)
Python
177
star
5

BREDS

"Bootstrapping Relationship Extractors with Distributional Semantics" (Batista et al., 2015) in EMNLP'15 - Python implementation
Python
145
star
6

Aspect-Based-Sentiment-Analysis

Aspect-Based Sentiment Analysis Experiments
Python
133
star
7

text-classification

An example of how to train supervised classifiers for multi-label text classification using sklearn pipelines
Jupyter Notebook
110
star
8

ConvNets-for-Sentence-Classification

"Convolutional Neural Networks for Sentence Classification" (Kim 2014) - https://www.aclweb.org/anthology/D14-1181
Jupyter Notebook
54
star
9

machine-learning-notebooks

Assorted exercises and proofs of concept to understand and study machine learning and statistical learning theory
Jupyter Notebook
44
star
10

lexicons

Dictionaries of names, surnames, acronyms and their extensions, stop-words, etc., which I gathered for different experiments.
29
star
11

TAC-Entity-Linking

An entity linking prototype, developed using the datasets from the TAC-KBP sub-task.
Java
28
star
12

awesome-Portuguese-NLP

A list of libraries and NLP projects for Portuguese
19
star
13

information-extraction-PT

An example of triples extraction with PoS-tags using ReVerb
Python
16
star
14

REACTION-resources

Resources developed by and for the REACTION project (Retrieval, Extraction and Aggregation Computing Technology for Integrating and Organizing News), an initiative for developing a computational journalism platform (mostly) for Portuguese.
9
star
15

StanfordNER-experiments

Python
8
star
16

SLANG-Sequence-LAbeliNG

Sequence LAbeliNG with Neural Networks: "Neural Architectures for Named Entity Recognition" (Lample et al., 2016) and "End-to-end Sequence Labeling via Bi-directional LSTM-CNNs-CRF" (Ma and Hovy, 2016)
Jupyter Notebook
7
star
17

Toponym-Disambiguation-Using-Ontology-Based-Semantic-Similarity

Toponym Disambiguation using Ontology-based Semantic Similarity.
Python
6
star
18

coding-exercises

A repository of coding interview questions and solutions
Python
6
star
19

MuSICo

A Minwise Hashing Method for Addressing Relationship Extraction from Text
Java
5
star
20

bash-shell-utils

bash scripts, sed examples, and other stuff that I need from time to time
Shell
4
star
21

Temporal-Information-Datasets

3
star
22

minhash-classifier

Supervised relationship extraction based on min-hash and locality-sensitive hashing
Python
3
star
23

NER-English-Gigaword-LDC

Python scripts to parse the Gigaword collection and perform NER tagging with StanfordNER
Python
3
star
24

dbpedia-webapps

Simple webapps, relying on DBpedia as a data-source.
Python
3
star
25

GermEval-2019-Task_1

GermEval 2019 Task 1 - Shared Task on Hierarchical Classification of Blurbs
Python
2
star
26

ml-report-kit

A plug-in to generate various evaluation metrics and reports (PR curves, classification reports, confusion matrices) for supervised machine learning models using only two lines of code.
Python
2
star
27

nostalgia

Old projects of mine, done during high-school or university and found in old hard-drives
HTML
1
star
28

politiquices

Explore relationships of support and opposition between political figures, as expressed in news headlines preserved in arquivo.pt
1
star
29

Snowball-Java

Snowball: Extracting Relations from Large Plain-Text Collections
Java
1
star
30

GermEval-2017-Aspect-Based-Sentiment-Analysis

Python
1
star
31

davidsbatista.net

my personal homepage and blog
HTML
1
star