  • Stars: 1,365
  • Rank: 34,436 (Top 0.7%)
  • Language: Ruby
  • License: Other
  • Created almost 13 years ago
  • Updated over 7 years ago

Repository Details

Natural language processing framework for Ruby.

New in v2.0.5: OpenNLP integration and Yomu support

Treat is a toolkit for natural language processing and computational linguistics in Ruby. The Treat project aims to build a language- and algorithm-agnostic NLP framework for Ruby with support for tasks such as document retrieval, text chunking, segmentation and tokenization, natural language parsing, part-of-speech tagging, keyword extraction and named entity recognition. Learn more by taking a quick tour or by reading the manual.

Features

  • Text extractors for PDF, HTML, XML, Word, AbiWord, OpenOffice and image formats (Ocropus).
  • Text chunkers, sentence segmenters, tokenizers, and parsers (Stanford & Enju).
  • Lexical resources (WordNet interface, several POS taggers for English).
  • Language, date/time, topic words (LDA) and keyword (TF*IDF) extraction.
  • Word inflectors, including stemmers, conjugators, declensors, and number inflection.
  • Serialization of annotated entities to YAML, XML or to MongoDB.
  • Visualization in ASCII tree, directed graph (DOT) and tag-bracketed (standoff) formats.
  • Linguistic resources, including language detection and tag alignments for several treebanks.
  • Machine learning (decision tree, multilayer perceptron, LIBLINEAR, LIBSVM).
  • Text retrieval with indexation and full-text search (Ferret).
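
To make the keyword extraction feature more concrete, here is a minimal sketch of TF*IDF scoring in plain Ruby. It is not Treat's API and does not use the gem: the method names `tokenize`, `tf_idf` and `keywords`, the regex tokenizer and the sample documents are all assumptions made for illustration. It shows only the general idea, where a term scores high when it is frequent in one document and rare across the collection.

```ruby
# Illustrative sketch only: plain-Ruby TF*IDF, NOT Treat's API.
# All method names and sample documents below are hypothetical.

# Split text into lowercase word tokens with a naive regex tokenizer.
def tokenize(text)
  text.downcase.scan(/[a-z0-9']+/)
end

# Score every term of every document as tf * idf, where
#   tf  = occurrences of the term in the document / tokens in the document
#   idf = log(number of documents / number of documents containing the term)
# Returns one Hash (term => score) per input document.
def tf_idf(documents)
  tokenized = documents.map { |doc| tokenize(doc) }

  # Count in how many documents each term appears.
  doc_freq = Hash.new(0)
  tokenized.each { |tokens| tokens.uniq.each { |term| doc_freq[term] += 1 } }

  n = documents.size.to_f
  tokenized.map do |tokens|
    tokens.tally.to_h do |term, count|
      [term, (count.to_f / tokens.size) * Math.log(n / doc_freq[term])]
    end
  end
end

# Return the k highest-scoring terms; ties are broken alphabetically.
def keywords(scores, k = 3)
  scores.sort_by { |term, score| [-score, term] }.first(k).map(&:first)
end

docs = [
  "the parser builds a parse tree and the parser labels the tree",
  "the tagger assigns a tag to each token in the sentence",
  "the stemmer reduces each token to a stem"
]
scores = tf_idf(docs)
puts keywords(scores[0], 2).inspect
```

With these sample documents, the first document's top two keywords are "parser" and "tree", while "the" scores zero because it appears in every document. The sketch needs Ruby 2.7 or later for `Enumerable#tally`.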

Contributing

I am actively seeking developers who can help maintain and expand this project. You can find a list of ideas for contributing to the project here.

Authors

Lead developer: @louismullie [Twitter]

Contributors:

  • @bdigital
  • @automatedtendencies
  • @LeFnord
  • @darkphantum
  • @whistlerbrk
  • @smileart
  • @erol

License

This software is released under the GPL License and includes software released under the GPL, Ruby, Apache 2.0 and MIT licenses.

More Repositories

  1. stanford-core-nlp (Ruby, 433 stars): Ruby bindings to the Stanford Core NLP tools (English, French, German).
  2. open-nlp (Ruby, 91 stars): Ruby bindings to the OpenNLP Java toolkit.
  3. graph-rank (Ruby, 75 stars): Ruby implementation of the PageRank and TextRank algorithms.
  4. scalpel (Ruby, 51 stars): A fast and accurate rule-based sentence segmentation tool for Ruby.
  5. watershed-cuda (Python, 30 stars): An implementation of the watershed algorithm in CUDA.
  6. erc-js (JavaScript, 27 stars): A Javascript implementation of Reed-Solomon error correcting codes.
  7. aes-js (JavaScript, 15 stars): Fast and slim Javascript implementation of AES in ECB and CTR modes
  8. bind-it (Ruby, 12 stars): BindIt is a tool to facilitate the creation of Java bindings in Ruby.
  9. ope-rb (Ruby, 12 stars): Ruby implementation of Boldyreva's order-preserving encryption scheme
  10. hom-js (JavaScript, 9 stars): Javascript implementation of the Paillier additive homomorphic cryptosystem
  11. repos (8 stars): A list of my open-source libraries and algorithm implementations
  12. syme (JavaScript, 8 stars): Syme: a decentralized key infrastructure and message exchange platform.
  13. closet (Ruby, 8 stars): Carefully crafted skeleton for Sinatra applications.
  14. emcee (Ruby, 7 stars): Markov chains for rap lyric generation.
  15. kademlia-webrtc (JavaScript, 7 stars): JS implementation of the Kademlia DHT on top of WebRTC and IndexedDB
  16. graphr (Ruby, 6 stars): Graph-related Ruby classes.
  17. schiphol (Ruby, 6 stars): A Ruby downloader script with progress bar, retries, 301/2 following and MIME type-based format detection.
  18. web-ct-segmentation (5 stars)
  19. tf-idf-emr (Python, 4 stars): MapReduce implementation of TF*IDF with on Amazon EMR
  20. drbg-rb (Ruby, 4 stars): Cryptographically secure deterministic random bit generators for Ruby
  21. birch (C, 3 stars): A Ruby tree implementation with an optional C extension for speed.
  22. covidb (HTML, 2 stars): Multimedia database of COVID cases.
  23. siv-rb (C, 2 stars): Ruby C extension for the AES-SIV deterministic encryption mode (RFC 5297)
  24. cmac-rb (C, 2 stars): Ruby C extension for the AES-CMAC keyed hash function (RFC 4493)
  25. rex-rac (Ruby, 1 star): A Rex-/Racc Ruby parser skeleton