Awesome Community-Curated NLP List
To contribute: This list is community-curated; anyone can open a pull request to add to it. A PR will be merged once 5 people have verified that it is not spam.
Speech NLP
- Automatic Speech Recognition (Speech-to-Text)
- Speech Synthesis (Text-to-Speech)
- Generic Speech Analysis/Modelling Tools
- List of Lists of Speech tools
Text NLP Suites
- NLTK
- Gensim
- spaCy
- Stanford CoreNLP
- FreeLing
- OpenNLP
- DKPro
- PyNLPl
- IXA Pipes
- NLP4J
- CogComp's NLP libraries
- Stanbol NLP
- LIMA
- Corpus.Tools
- NooJ
- SALAT
Language Specific Text NLP Suites
- Arabic
  - SAFAR: Software Architecture For Arabic language pRocessing
- Cantonese
  - PyCantonese: Cantonese Linguistics and NLP in Python
- Chinese
  - SnowNLP: Simplified Chinese Text Processing
- Persian
  - Hazm: Python library for digesting Persian text.
- Dutch
  - Frog: An advanced NLP suite for Dutch
- Italian
  - Tint: Lend color to your Italian texts!
- Korean
  - KoNLPy: Korean NLP in Python
Pre-processing (Tokenization / Stemming / POS Tagging / etc.)
- Ngrams
  - Colibri Core - C++ and Python tools for n-grams and skipgrams
- Stemming
- Tokenizers
Deep Linguistic Processing
The "deep" here isn't the "deep learning" deep ;P, see https://en.wikipedia.org/wiki/Deep_linguistic_processing
- Head-driven Phrase Structure Grammar (HPSG)
  - DELPH-IN: Deep Linguistic Processing with HPSG
  - English Resource Grammar
- Combinatory Categorial Grammar (CCG)
  - CCG2PST: A tool for converting CCG derivations into PTB-style phrase structure trees
Word Embeddings
Task Specific
- Entity Linking / Relation Extraction
- Coreference
  - Berkeley Coreference Error Analyser - A tool for classifying errors in coreference resolution
- Parsing
  - Berkeley Parse Error Analyser: A tool for classifying mistakes in the output of parsers
Machine Translation
- Neural MT
- Phrase-based MT
- Rule-based MT
- Example-based MT
- MT-related tools
- MT List of lists
Language Modelling
Annotation Related
- Annotation Platforms
  - Brat Rapid Annotation Tool: Online environment for collaborative text annotation
  - PyBossa: The ultimate crowdsourcing framework
  - FLAT: FoLiA Linguistic Annotation Tool
- Annotation Tools
  - TableAnnotator and TabInOut
  - Marvin: Semantic text annotation tools using WordNet and DBpedia
Others
- Author Attribution
- Orthography
- NLP API / Workflow
  - CLAM: Turn command-line applications into RESTful webservices with web front-end.
  - LuigiNLP: Experimental NLP Pipeline system built on top of SciLuigi
  - TextFlows / ClowdFlows
- Misc
NLP Related Machine Learning Tools
List of Lists of NLP Resources/Tools
- Awesome NLP (The original one, curated by @keon and @outpark)
- Repos tagged with nlp on GitHub.com
- Java or Python for NLP?
- OpenSource Deep QA Resources (also nice talk)
- Sibawayh Repository for Arabic NLP
- @proycon La Machine
- Ruby NLP Resources/Tools
Dataset Lists
- @niderhoff NLP Datasets
- @karthikncode NLP Datasets