  • Stars: 12
  • Rank 1,597,372 (Top 32 %)
  • Language: Go
  • License: MIT License
  • Created about 11 years ago
  • Updated about 11 years ago


Repository Details

porter stemmer

Porter Stemmer for Go

This is a fairly straightforward port of Martin Porter's C implementation of the Porter stemming algorithm. The C version this port is based on is available for download here: http://tartarus.org/~martin/PorterStemmer/c_thread_safe.txt

The original algorithm is described in the paper:

M.F. Porter, 1980, An algorithm for suffix stripping, Program, 14(3), pp. 130-137.

While the internal implementation and its internal interface are nearly identical to the original implementation, the public Go interface is much simplified. The stemmer can be called as follows:

import "porter"
...
stemmed := porter.Stem(word_to_stem)

Installing

go get github.com/a2800276/porter

To use the stemmer when installed using go get, import:

import "github.com/a2800276/porter"

Limitations

While the implementation is fairly robust, this is a work in progress. In particular, a new interface will likely be provided to prevent excessive conversions between strings and []byte. Currently, when Stem is called, the string argument is converted to a byte slice, the algorithm works on that slice, and the result is converted back into a string before returning.
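The conversion pattern described above can be sketched as follows. This is only an illustration under stated assumptions: `stemBytes` and `stemString` are hypothetical names, not part of this package, and `stemBytes` is a toy stand-in that trims a single trailing "s" rather than running the Porter algorithm.

```go
package main

import (
	"bytes"
	"fmt"
)

// stemBytes is a hypothetical stand-in for a byte-slice stemmer.
// It only trims one trailing "s"; it is NOT the Porter algorithm.
func stemBytes(word []byte) []byte {
	return bytes.TrimSuffix(word, []byte("s"))
}

// stemString mirrors the round trip described in the text:
// string -> []byte, work on the bytes, []byte -> string.
// Each of the two conversions copies the data.
func stemString(word string) string {
	return string(stemBytes([]byte(word)))
}

func main() {
	// A caller holding a string pays for both conversions.
	fmt.Println(stemString("cats"))

	// A caller that already holds bytes could skip the round trip
	// if a byte-slice entry point were exposed.
	raw := []byte("dogs")
	fmt.Println(string(stemBytes(raw)))
}
```

A byte-slice entry point along these lines would let callers that already hold []byte data, for example from a tokenizer, avoid both copies.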

Also, the implementation is not particularly robust at handling Unicode input. Currently, the only handling is that bytes with the high bit set are ignored. It's up to the caller to make sure the string contains only ASCII characters. Since the algorithm itself operates on English words only, this doesn't restrict the functionality, but it is a nuisance.
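A caller-side check for the ASCII requirement above might look like the following minimal sketch. The helper name `isASCII` is an assumption made for illustration; it is not provided by this package.

```go
package main

import "fmt"

// isASCII (hypothetical helper) reports whether s contains only
// 7-bit ASCII bytes, i.e. no byte has its high bit set.
func isASCII(s string) bool {
	for i := 0; i < len(s); i++ {
		if s[i] >= 0x80 {
			return false
		}
	}
	return true
}

func main() {
	for _, w := range []string{"running", "naïve"} {
		if isASCII(w) {
			// Safe to hand to the stemmer.
			fmt.Printf("%q: ASCII only\n", w)
		} else {
			// Skip, transliterate, or report, depending on the application.
			fmt.Printf("%q: contains non-ASCII bytes\n", w)
		}
	}
}
```

Checking bytes rather than runes is sufficient here, since every non-ASCII character in UTF-8 is encoded using bytes with the high bit set.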

TODO:

  • byte slice API to avoid roundtripping to string and back
