• This repository has been archived on 20/Dec/2020
  • Stars: 46
  • Rank 601,591 (Top 13 %)
  • Language: Jupyter Notebook
  • Created over 8 years ago
  • Updated over 8 years ago

Repository Details

Provides a comprehensive list of tokenizers, features, and general NLP techniques used for text analysis, with examples. The initial focus is on features used for Twitter data and sentiment analysis.

More Repositories

1

spacy-html-tokenizer

Python
62
star
2

binder-notebooks

Notebooks configured to be run with Binder, usually found on my blog.
Jupyter Notebook
41
star
3

setfit

Python
38
star
4

syntax-speaker-prediction

The tastiest machine learning project. Can we predict who is speaking for how long during an episode of the syntax.fm podcast?
Jupyter Notebook
36
star
5

spacy-setfit-textcat

Python
28
star
6

clabel

A utility for labeling clusters of text data.
Python
28
star
7

streamlitopedia

Collection of code snippets and utilities for Streamlit apps.
HTML
22
star
8

remerge-mwe

REMERGE - Multi-Word Expression discovery algorithm
Python
14
star
9

nyc-test-tracker

Python
14
star
10

excelify

IPython Magic for exporting pandas objects to Excel
Python
13
star
11

spacy-v3-project-startup

Python
12
star
12

open-tda

An attempt to replicate Ayasdi's topological data analysis software with open source tools.
Jupyter Notebook
9
star
13

voronoiville

Rust
9
star
14

embuddy

`embuddy` is a package that helps with using text embeddings for local data analysis.
Python
8
star
15

prodigy-iaa

Python
7
star
16

dank-data-explorer

Example Streamlit app for demoing concepts from Streamlitopedia.
Python
6
star
17

twitter-ira-network-notebooks

This repository contains a few notebooks with some light commentary on using graph-tool for network analysis with the Twitter IRA Data.
Jupyter Notebook
6
star
18

abm-social-distancing

HTML
5
star
19

boots

A tiny statistical bootstrapping library.
Python
5
star
20

spacy-vscode-utils

A collection of utilities to assist with using spaCy in VSCode
Shell
5
star
21

demo-ard-text

Demonstration notebooks for processes to reduce volume of news articles relative to some topic. Runnable in Binder.
Jupyter Notebook
5
star
22

Python-for-SAS-Users

This repository contains several IPython Notebooks to help translate from the SAS statistical language to equivalents using Python libraries.
Jupyter Notebook
4
star
23

general-inquirer-remix

Python
4
star
24

oatmeal

CLI for training multiclass and multilabel text classification models with BERT.
Python
3
star
25

dispenv

A CLI tool for creating disposable environments.
Python
3
star
26

spacy-project-viz

Python
3
star
27

spacy-experimental-typing

Python
3
star
28

corpus_statistics

Python
2
star
29

pypdfium-explore

Jupyter Notebook
2
star
30

fasttext-lite

Python
2
star
31

flask-ask

Testing out flask-ask for programming with Alexa.
Python
1
star
32

personal-pelican-site

Jupyter Notebook
1
star
33

pdf-sketches

Jupyter Notebook
1
star
34

adl-datadive

Data Dive for ADL
Jupyter Notebook
1
star
35

spacy-altair-theme

A theme for making altair plots matching the spaCy brand.
Python
1
star
36

nav-labeled-data

HTML
1
star
37

prodigy-iaa-poc

Proof of concept for IAA/IRR metrics in a Prodigy recipe.
Python
1
star
38

altair-saver-playwright

An easier to install version of `altair_saver`
Python
1
star
39

graph-tool-pip

tiagopeixoto/graph-tool image with pip installed
Dockerfile
1
star
40

publaynet-data

Python
1
star
41

irr-bootstrap-sim

Jupyter Notebook
1
star