Peter Baumgartner (@pmbaumgartner)

Top repositories

1

spacy-html-tokenizer

Python
62
star
2

text-feat-lib

Provide a comprehensive list of tokenizers, features, and general NLP things used for text analysis with examples. The initial focus is on features used for Twitter data and sentiment analysis.
Jupyter Notebook
46
star
3

binder-notebooks

Notebooks configured to be run with Binder, usually found on my blog.
Jupyter Notebook
41
star
4

setfit

Python
38
star
5

syntax-speaker-prediction

The tastiest machine learning project. Can we predict who is speaking for how long during an episode of the syntax.fm podcast?
Jupyter Notebook
36
star
6

spacy-setfit-textcat

Python
28
star
7

clabel

A utility for labeling clusters of text data.
Python
28
star
8

streamlitopedia

Collection of code snippets and utilities for streamlit apps
HTML
22
star
9

remerge-mwe

REMERGE - Multi-Word Expression discovery algorithm
Python
14
star
10

nyc-test-tracker

Python
14
star
11

excelify

IPython Magic for exporting pandas objects to Excel
Python
13
star
12

spacy-v3-project-startup

Python
12
star
13

open-tda

An attempt to replicate Ayasdi topological data analysis software with open source tools.
Jupyter Notebook
9
star
14

voronoiville

Rust
9
star
15

embuddy

`embuddy` is a package that helps with using text embeddings for local data analysis.
Python
8
star
16

prodigy-iaa

Python
7
star
17

dank-data-explorer

Example streamlit app for demoing concepts from Streamlitopedia.
Python
6
star
18

twitter-ira-network-notebooks

This repository contains a few notebooks with some light commentary on using graph-tool for network analysis with the Twitter IRA Data.
Jupyter Notebook
6
star
19

abm-social-distancing

HTML
5
star
20

boots

A tiny statistical bootstrapping library.
Python
5
star
21

spacy-vscode-utils

A collection of utilities to assist with using spaCy in VSCode
Shell
5
star
22

demo-ard-text

Demonstration notebooks for processes to reduce volume of news articles relative to some topic. Runnable in Binder.
Jupyter Notebook
5
star
23

Python-for-SAS-Users

This repository contains several IPython Notebooks to help translate from the SAS statistical language to equivalents using Python libraries.
Jupyter Notebook
4
star
24

general-inquirer-remix

Python
4
star
25

oatmeal

CLI for training multiclass and multilabel text classification models with BERT.
Python
3
star
26

dispenv

A CLI tool for creating disposable environments.
Python
3
star
27

spacy-project-viz

Python
3
star
28

spacy-experimental-typing

Python
3
star
29

corpus_statistics

Python
2
star
30

pypdfium-explore

Jupyter Notebook
2
star
31

fasttext-lite

Python
2
star
32

flask-ask

Testing out flask-ask for programming with Alexa
Python
1
star
33

personal-pelican-site

Jupyter Notebook
1
star
34

pdf-sketches

Jupyter Notebook
1
star
35

adl-datadive

Data Dive for ADL
Jupyter Notebook
1
star
36

spacy-altair-theme

A theme for making altair plots matching the spaCy brand.
Python
1
star
37

nav-labeled-data

HTML
1
star
38

prodigy-iaa-poc

Proof of concept for IAA/IRR metrics in a Prodigy recipe
Python
1
star
39

altair-saver-playwright

An easier to install version of `altair_saver`
Python
1
star
40

graph-tool-pip

tiagopeixoto/graph-tool image with pip installed
Dockerfile
1
star
41

publaynet-data

Python
1
star
42

irr-bootstrap-sim

Jupyter Notebook
1
star