CMU ARK Twitter Part-of-Speech Tagger v0.3.2
http://www.ark.cs.cmu.edu/TweetNLP/

Basic usage for released version
================================

Requires Java 6. To run the tagger on example data, try:

    java -Xmx500m -jar ark-tweet-nlp-0.3.2.jar examples/example_tweets.txt

where the jar file is the one included in the release download.

The tagger outputs tokens, predicted part-of-speech tags, and confidences.
Use the "--help" flag for more information. On Unix systems, "./runTagger.sh"
invokes the tagger; e.g.

    ./runTagger.sh examples/example_tweets.txt
    ./runTagger.sh --help

We also include a script that invokes just the tokenizer:

    ./twokenize.sh examples/example_tweets.txt

You may have to adjust the parameters to "java" depending on your system.

If instead you are using a source checkout, see docs/hacking.txt for info.

Information
===========

Version 0.3 of the tagger is much faster and more accurate. Please see the
tech report on the website for details.

For the Java API, see src/cmu/arktweetnlp; especially Tagger.java.
See also documentation in docs/ and src/cmu/arktweetnlp/package.html.

This tagger is described in the following two papers, available at the website.
Please cite these if you write a research paper using this software.

Part-of-Speech Tagging for Twitter: Annotation, Features, and Experiments
Kevin Gimpel, Nathan Schneider, Brendan O'Connor, Dipanjan Das, Daniel Mills,
Jacob Eisenstein, Michael Heilman, Dani Yogatama, Jeffrey Flanigan, and
Noah A. Smith
In Proceedings of the Annual Meeting of the Association
for Computational Linguistics, companion volume, Portland, OR, June 2011.
http://www.ark.cs.cmu.edu/TweetNLP/gimpel+etal.acl11.pdf

Part-of-Speech Tagging for Twitter: Word Clusters and Other Advances
Olutobi Owoputi, Brendan O'Connor, Chris Dyer, Kevin Gimpel, and
Nathan Schneider.
Technical Report, Machine Learning Department. CMU-ML-12-107. September 2012.
Contact
=======

Please contact Brendan O'Connor ([email protected]) and Kevin Gimpel
([email protected]) if you encounter any problems.
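The usage notes above say the tagger outputs tokens, predicted part-of-speech tags, and confidences. As a rough sketch of how that output might be consumed from a script, the Python helper below parses one output line. It assumes a tab-separated layout (space-separated tokens, then tags, then confidences, optionally followed by the original tweet); that layout is an assumption here, so check "--help" for the actual formats. The names `TaggedToken` and `parse_tagger_line` are hypothetical and not part of the release, and the sample line is made up for illustration.

```python
from typing import List, NamedTuple


class TaggedToken(NamedTuple):
    """One token with its predicted tag and confidence (hypothetical helper type)."""
    token: str
    tag: str
    confidence: float


def parse_tagger_line(line: str) -> List[TaggedToken]:
    """Parse one line of tagger output.

    ASSUMED layout (verify against --help before relying on it):
        tokens <TAB> tags <TAB> confidences [<TAB> original tweet]
    where tokens, tags, and confidences are each space-separated.
    """
    fields = line.rstrip("\n").split("\t")
    if len(fields) < 3:
        raise ValueError("expected at least 3 tab-separated fields")

    tokens = fields[0].split(" ")
    tags = fields[1].split(" ")
    confidences = [float(c) for c in fields[2].split(" ")]

    # Every token needs exactly one tag and one confidence.
    if not (len(tokens) == len(tags) == len(confidences)):
        raise ValueError("token, tag, and confidence counts differ")

    return [TaggedToken(t, g, c) for t, g, c in zip(tokens, tags, confidences)]


# Made-up sample line, for illustration only.
sample = "I love NLP\tO V ^\t0.99 0.98 0.75\tI love NLP"
for item in parse_tagger_line(sample):
    print(item.token, item.tag, item.confidence)
```

In practice the lines would come from the standard output of the "java -Xmx500m -jar ark-tweet-nlp-0.3.2.jar ..." command shown under basic usage, for example by piping it into a script that calls this function once per line.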
tweetmotif
Topical search for Twitter. See twokenize.py, emoticons.py for tokenization.

stanford_corenlp_pywrapper

tsvutils
Utilities for processing tab-separated files

awkspeed
Speed testing for a data munging task

arkref
http://www.ark.cs.cmu.edu/ARKref/

scalacheat
cheat sheet for scala syntax

parseviz
Visualize constituent and dependency parses as PDF or image formats, through GraphViz.

OConnor_IREvents_ACL2013
Replication software, data, and supplementary materials for the paper: O'Connor, Stewart and Smith, ACL-2013, "Learning to Extract International Relations from Political Context"

mte
MiTextExplorer - interactive browser of text and document covariates.

myutil

dlanalysis
a bunch of R code for various statistical analyses

conplot
Console ascii art plotter - quick-and-dirty data visualization, e.g. for log statistics

running_stat
Running variance / standard deviation calculation (C++ and Python)

cmdutils
Some command-line utilities, mostly for data manipulation and inspection.

muc4_proc
preprocessing of the MUC4 dataset

bow
A patched version of bow & rainbow 20020213 that compiles with modern gcc 4.0.1, OSX 10.5

twitter_geo_preproc
A preprocessing script to get geo-coded tweets from the Streaming API

gfl_syntax
Graph Fragment Language for Easy Syntactic Annotation

nlp_jobs
research code from rion and brendan when writing snow, o'connor, jurafsky, ng EMNLP-2008 "cheap and fast, but is it good?"

stanfordnlp-util
java utilities for stanford nlp

gigaword_conversion

glmnet_starter
Starter code for the glmnet package (elastic net regressions)

slmunge
Scripts to munge certain machine learning sparse data formats, including SVMLight/LibSVM

twitter_geo_viz
REALLY HALFBAKED DO NOT USE YOU MAY CRASH OUR SERVER

namefreedom
data and analysis of country names versus democratic freedoms

viewdb
HTML report of an SQL DB's schema and data

super_tuesday_2020
analysis of Super Tuesday exit poll data

flex-for-morpha
Patched version of GNU Flex 2.5.35 to compile "morpha"

beta_explorer

flightstats

randomsearch
web app to randomly choose which search engine to use per query