Repository Details

A collection of text algorithms for Ruby. Install it with: gem install text

require 'text'

Levenshtein distance

Text::Levenshtein.distance('test', 'test')
# => 0
Text::Levenshtein.distance('test', 'tent')
# => 1
Text::Levenshtein.distance('test', 'testing')
# => 3
# An optional third argument caps the distance that is returned
Text::Levenshtein.distance('test', 'testing', 2)
# => 2
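
For readers who want to see what the numbers above measure, here is a minimal pure-Ruby sketch of the classic dynamic-programming edit distance. It is not the gem's implementation: `levenshtein_sketch` is a hypothetical name, and clamping the result to the optional maximum is an assumption made only to mirror the last example above.

```ruby
# Hypothetical helper for illustration; not part of the text gem.
def levenshtein_sketch(a, b, max_distance = nil)
  b_chars = b.chars
  # previous_row[j] holds the distance between the processed prefix of a
  # (initially empty) and the first j characters of b.
  previous_row = (0..b_chars.length).to_a

  a.chars.each_with_index do |a_char, i|
    current_row = [i + 1]
    b_chars.each_with_index do |b_char, j|
      substitution = previous_row[j] + (a_char == b_char ? 0 : 1)
      insertion    = current_row[j] + 1
      deletion     = previous_row[j + 1] + 1
      current_row << [substitution, insertion, deletion].min
    end
    previous_row = current_row
  end

  distance = previous_row.last
  # Assumption: a maximum simply clamps the reported distance.
  max_distance ? [distance, max_distance].min : distance
end
```

This sketch always fills the whole table, so a cap here only changes the value returned, not the amount of work done.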

Metaphone

Text::Metaphone.metaphone('BRIAN')
# => 'BRN'

Text::Metaphone.double_metaphone('Coburn')
# => ['KPRN', nil]
Text::Metaphone.double_metaphone('Angier')
# => ['ANJ', 'ANJR']

Soundex

Text::Soundex.soundex('Knuth')
# => 'K530'
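
To show where a code such as 'K530' comes from, the following is a rough sketch of the commonly described American Soundex rules: keep the first letter, map later consonants to digits, skip vowels, collapse repeated codes, and pad to four characters. `soundex_sketch` and `SOUNDEX_CODES` are hypothetical names, and the gem's own implementation may differ in edge cases.

```ruby
# Hypothetical table and helper for illustration; not part of the text gem.
SOUNDEX_CODES = {
  'B' => '1', 'F' => '1', 'P' => '1', 'V' => '1',
  'C' => '2', 'G' => '2', 'J' => '2', 'K' => '2',
  'Q' => '2', 'S' => '2', 'X' => '2', 'Z' => '2',
  'D' => '3', 'T' => '3',
  'L' => '4',
  'M' => '5', 'N' => '5',
  'R' => '6'
}.freeze

def soundex_sketch(word)
  letters = word.upcase.gsub(/[^A-Z]/, '').chars
  return nil if letters.empty?

  result = letters.first.dup
  previous_code = SOUNDEX_CODES[letters.first]

  letters.drop(1).each do |letter|
    # H and W are skipped without separating two equal codes.
    next if letter == 'H' || letter == 'W'

    code = SOUNDEX_CODES[letter]
    result << code if code && code != previous_code
    # Vowels have no code, so they reset previous_code to nil.
    previous_code = code
  end

  # Truncate or pad with zeros to one letter plus three digits.
  result[0, 4].ljust(4, '0')
end
```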

Porter stemming

Text::PorterStemming.stem('abatements')  # => 'abat'

White similarity

white = Text::WhiteSimilarity.new
white.similarity('Healed', 'Sealed')   # => 0.8
white.similarity('Healed', 'Help')     # => 0.25

Note that some intermediate information is cached on the instance to improve performance.
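
The scores above are consistent with Simon White's letter-pair measure: twice the number of shared adjacent-letter pairs, divided by the total number of pairs in both strings. The sketch below illustrates that calculation and one plausible form of per-instance caching. `BigramSimilaritySketch` is a hypothetical class, and caching the pairs per input string is an assumption about what "intermediate information" could mean, not a description of the gem's internals.

```ruby
# Hypothetical class for illustration; not part of the text gem.
class BigramSimilaritySketch
  def initialize
    # Assumed caching strategy: remember the letter pairs of each string seen.
    @bigram_cache = {}
  end

  def similarity(str1, str2)
    pairs1 = bigrams(str1)
    pairs2 = bigrams(str2).dup # copy, because matched pairs are removed below
    total = pairs1.length + pairs2.length
    return 0.0 if total.zero?

    shared = 0
    pairs1.each do |pair|
      index = pairs2.index(pair)
      next unless index

      shared += 1
      pairs2.delete_at(index) # each pair may be matched only once
    end

    2.0 * shared / total
  end

  private

  # Upper-cases the string and collects adjacent letter pairs word by word.
  def bigrams(str)
    @bigram_cache[str] ||= str.upcase.split(/\s+/).flat_map { |word|
      word.chars.each_cons(2).map(&:join)
    }
  end
end
```

With a cache like this, reusing one instance across many comparisons avoids recomputing the pairs for strings that recur, at the cost of memory that grows with the number of distinct inputs.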

Ruby version compatibility

The library has been tested on Ruby 1.8.6 to 1.9.3 and on JRuby.

Thanks

  • Hampton Catlin (hcatlin) for Ruby 1.9 compatibility work
  • Wilker Lúcio for the initial implementation of the White algorithm

License

MIT. See COPYING.txt for details.
