  • Stars: 13
  • Rank: 1,512,713 (Top 30%)
  • Language: Python
  • Created: over 14 years ago
  • Updated: over 2 years ago

Repository Details

Some command-line utilities, mostly for data manipulation and inspection.

More Repositories

1. ark-tweet-nlp (Java, 575 stars)
   CMU ARK Twitter Part-of-Speech Tagger

2. tweetmotif (Python, 162 stars)
   Topical search for Twitter. See twokenize.py, emoticons.py for tokenization.

3. stanford_corenlp_pywrapper (Java, 151 stars)

4. tsvutils (Python, 127 stars)
   Utilities for processing tab-separated files

5. awkspeed (C++, 44 stars)
   Speed testing for a data munging task

6. arkref (Java, 32 stars)
   http://www.ark.cs.cmu.edu/ARKref/

7. scalacheat (Shell, 32 stars)
   cheat sheet for scala syntax

8. parseviz (Python, 31 stars)
   Visualize constituent and dependency parses as PDF or image formats, through GraphViz.

9. OConnor_IREvents_ACL2013 (C++, 26 stars)
   Replication software, data, and supplementary materials for the paper: O'Connor, Stewart and Smith, ACL-2013, "Learning to Extract International Relations from Political Context"

10. mte (Java, 24 stars)
    MiTextExplorer - interactive browser of text and document covariates.

11. myutil (Java, 23 stars)

12. dlanalysis (R, 21 stars)
    a bunch of R code for various statistical analyses

13. conplot (Python, 18 stars)
    Console ascii art plotter - quick-and-dirty data visualization, e.g. for log statistics

14. running_stat (Python, 14 stars)
    Running variance / standard deviation calculation (C++ and Python)

15. muc4_proc (Python, 11 stars)
    preprocessing of the MUC4 dataset

16. bow (C, 11 stars)
    A patched version of bow & rainbow 20020213 that compiles with modern gcc 4.0.1, OSX 10.5

17. twitter_geo_preproc (Python, 9 stars)
    A preprocessing script to get geo-coded tweets from the Streaming API

18. gfl_syntax (Python, 8 stars)
    Graph Fragment Language for Easy Syntactic Annotation

19. nlp_jobs (Ruby, 6 stars)
    research code from rion and brendan when writing snow, o'connor, jurafsky, ng EMNLP-2008 "cheap and fast, but is it good?"

20. stanfordnlp-util (Java, 5 stars)
    java utilities for stanford nlp

21. gigaword_conversion (Python, 3 stars)

22. glmnet_starter (R, 2 stars)
    Starter code for the glmnet package (elastic net regressions)

23. slmunge (Python, 2 stars)
    Scripts to munge certain machine learning sparse data formats, including SVMLight/LibSVM

24. twitter_geo_viz (JavaScript, 2 stars)
    REALLY HALFBAKED DO NOT USE YOU MAY CRASH OUR SERVER

25. namefreedom (2 stars)
    data and analysis of country names versus democratic freedoms

26. viewdb (Python, 1 star)
    HTML report of an SQL DB's schema and data

27. super_tuesday_2020 (HTML, 1 star)
    analysis of Super Tuesday exit poll data

28. flex-for-morpha (C, 1 star)
    Patched version of GNU Flex 2.5.35 to compile "morpha"

29. beta_explorer (1 star)

30. flightstats (Python, 1 star)

31. randomsearch (Python, 1 star)
    web app to randomly choose which search engine to use per query