Joseph Turian (@turian)

Top repositories

1

neural-language-model

Implementation of neural language models, in particular Collobert + Weston (2008) and a stochastic margin-based version of Mnih's LBL.
Python
178
star
2

textSNE

2-d visualization of high-dimensional input: Python code for rendering t-SNE code with text labels for each point
Python
107
star
3

topia.termextract

Updates to Zope's keyphrase extractor (forked from 1.1.0)
Python
67
star
4

crfchunking-with-wordrepresentations

Train a CRF for syntactic chunking (CoNLL2000), and use word representations
Python
43
star
5

common

Common Python library, especially for text processing and controlling experimental runs
Python
42
star
6

kea-service

KEA 5.0 (keyphrase extraction software), modified to be an XML-RPC service
Shell
42
star
7

pytextpreprocess

Preprocess text for NLP (tokenizing, lowercasing, stemming, sentence splitting, etc.)
Python
29
star
8

random-indexing-wordrepresentations

Induce word representations using random indexing (RI)
Python
29
star
9

save-my-browser-tabs

Extension for Mozilla Firefox and Google Chrome to save all of your open tabs to a text file (window/tab index, URL and title of each tab)
JavaScript
27
star
10

stanford-pos-tagger-service

XML-RPC version of the Stanford POS tagger
Python
21
star
11

common-scripts

Common scripts, mainly for text processing and experimental control
Python
20
star
12

pyrandomprojection

Random projection library for Python, converting a dictionary to low-dimensional numpy matrix
Python
18
star
13

donatefaces

Extract faces from video clips; generate training data for pose-invariant face features
Python
17
star
14

py80legsformat

In Python, read the .80 file format, for 80legs web crawl results.
Python
12
star
15

fatfreecrm-ec2

Deploy FatFree CRM on EC2
Shell
10
star
16

scikits.learn.recipes

Recipes for scikits.learn
Python
9
star
17

batchtrain

Find the best model using random hyperparameter optimization with scikit-learn
Python
9
star
18

parser-model

A neural network with a sparse input, for predicting decisions of a natural language syntax parser.
Python
8
star
19

django-instantmessage

IM-like application for Pinax social networks (Django) that allows you to see which friends are online and chat with them
8
star
20

simple-twitter-similarity

Didactic example of information retrieval, computing the similarity of two Twitter users
6
star
21

pytc-example

Example code for pytc (Python TokyoCabinet API)
Python
6
star
22

osqa

OSQA branch, with some fixes
Python
6
star
23

flickorpus

flickorpus collects an image and tag corpus from Flickr.
Python
6
star
24

biased-text-sample

Perform a biased sample of text data
Python
5
star
25

pycrowdflower

Python code for accessing the CrowdFlower API
5
star
26

wikiprep-postprocess

Postprocess XML output from wikiprep (Wikipedia preprocessor) into JSON
Python
5
star
27

query-classification-with-word-representations

KDDCup 2005 query classification with word representations
5
star
28

flann-1.2

Fork of FLANN 1.2, Fast Library for Approximate Nearest Neighbors
Python
5
star
29

osqa-install-webfaction

Install OSQA on WebFaction
Python
5
star
30

wordrepresentations-hmm

HMM model for word representations, using the method of Huang + Yates (2009).
4
star
31

fabricrecipes

Fabric recipes, primarily for deploying Ubuntu and EC2 instances.
Python
4
star
32

doubleblind

Django project to do blind testing and figure out which of your friends post things you actually like
Python
4
star
33

renderman-dexed-linux

Instructions for using the RenderMan Python API for controlling the Dexed FM synthesizer on Linux
Python
4
star
34

sounder

Tinder for discovering music
JavaScript
4
star
35

search-autocomplete

JavaScript autocomplete, with MySQL/PHP backend
3
star
36

pyshortstringcompression

Compress short strings, using the Huffman algorithm.
3
star
37

audio-discrimination-crowdsource-batch

Batch processing for audio-discrimination-crowdsource
Python
3
star
38

inverse-audio-synthesis

Inverse audio synthesis
Python
3
star
39

language-model-linear

A neural language model, intended to produce embeddings for a linear classifier
3
star
40

pitch-detection-echonest

Pitch detection for an audio file, using the Echo Nest Remix API
Python
3
star
41

soundcloudsampler

A widget to help you quickly sample SoundCloud tracks.
JavaScript
3
star
42

python-SimpleXMLRPCServer-permissive

A permissive version of the Python SimpleXMLRPCServer, which can correct errant XML input from the client.
Python
3
star
43

vworker-select-all-workers-firefox-extension

Firefox extension to select all workers in vWorker search results page
JavaScript
3
star
44

osqa-jsmath

jsMath support for OSQA
3
star
45

pycrunchbase

Python methods to interact with the Crunchbase API v1.
2
star
46

openl3_numpy_weights

OpenL3 audio model weights, in numpy format
2
star
47

transformer-fsd50k

HuBERT or wav2vec2 pretrained on FSD50K
2
star
48

lisadiary

A bliki (blog+wiki) compiler, inspired by ikiwiki
2
star
49

grab-wikipedia-abstracts

Grab all Wikipedia abstracts, in all languages
2
star
50

aucoder

Python
2
star
51

writing-collaboration

An article about scientific collaboration
2
star
52

audio-discrimination-crowdsource

Web service to crowd-source audio discrimination data
CSS
2
star
53

datasciencepatterns

1
star
54

audiojnd

Audio pair JND
Python
1
star
55

kinda-deep

Technical blog
JavaScript
1
star
56

sherlock-rest

A Django JSON REST API for Sherlock
Python
1
star
57

embeddingcache

Retrieve text embeddings, but cache them locally if we have already computed them.
Python
1
star
58

query-categorization-with-word-representations

KDDCup 2005 query classification with word representations
1
star
59

dx7render-docker

Render DX7 patches, dockerized
Dockerfile
1
star
60

archivebox-render

ArchiveBox blueprint for Render
1
star
61

batch-elki-cluster

1
star
62

grokmusic

Grok your music collection, and save it into a persistent format.
Python
1
star