Go package to calculate the Levenshtein Distance
The library is fully capable of working with non-ascii strings. But the strings are not normalized. That is left as a user-dependant use case. Please normalize the strings before passing it to the library if you have such a requirement.
As a performance optimization, the library can handle strings only up to 65536 characters (runes). If you need to handle strings larger than that, please pin to version 1.0.3.
go get github.com/agnivade/levenshtein
package main
import (
"fmt"
"github.com/agnivade/levenshtein"
)
func main() {
s1 := "kitten"
s2 := "sitting"
distance := levenshtein.ComputeDistance(s1, s2)
fmt.Printf("The distance between %s and %s is %d.\n", s1, s2, distance)
// Output:
// The distance between kitten and sitting is 3.
}
name time/op
Simple/ASCII-4 330ns ± 2%
Simple/French-4 617ns ± 2%
Simple/Nordic-4 1.16µs ± 4%
Simple/Tibetan-4 1.05µs ± 1%
name alloc/op
Simple/ASCII-4 96.0B ± 0%
Simple/French-4 128B ± 0%
Simple/Nordic-4 192B ± 0%
Simple/Tibetan-4 144B ± 0%
name allocs/op
Simple/ASCII-4 1.00 ± 0%
Simple/French-4 1.00 ± 0%
Simple/Nordic-4 1.00 ± 0%
Simple/Tibetan-4 1.00 ± 0%
name time/op
Leven/ASCII/agniva-4 353ns ± 1%
Leven/ASCII/arbovm-4 485ns ± 1%
Leven/ASCII/dgryski-4 395ns ± 0%
Leven/French/agniva-4 648ns ± 1%
Leven/French/arbovm-4 791ns ± 0%
Leven/French/dgryski-4 682ns ± 0%
Leven/Nordic/agniva-4 1.28µs ± 1%
Leven/Nordic/arbovm-4 1.52µs ± 1%
Leven/Nordic/dgryski-4 1.32µs ± 1%
Leven/Tibetan/agniva-4 1.12µs ± 1%
Leven/Tibetan/arbovm-4 1.31µs ± 0%
Leven/Tibetan/dgryski-4 1.16µs ± 0%