@saffsd
  • Stars 2,443
  • Global Rank 11,942 (Top 0.5 %)
  • Followers 73
  • Following 1
  • Registered over 14 years ago
  • Most used languages
    Python 73.3 %
    ASP 6.7 %
    JavaScript 6.7 %
    C 6.7 %
    C++ 6.7 %

Top repositories

1

langid.py

Stand-alone language identification system
Python
2,228
star
2

kaggle-stackoverflow2012

My entry to the Kaggle 2012 Stack Overflow competition. Ranked 10th on the final public leaderboard.
Python
46
star
3

wikidump

Tools to manipulate and extract data from Wikipedia dumps
Python
43
star
4

polyglot

Polyglot is a language identifier for detecting text documents containing text written in more than one language, and for identifying the languages therein.
Python
32
star
5

langid.c

Pure C natural language identifier with support for 97 languages
C
24
star
6

geniatagger

Part-of-speech tagging, shallow parsing, and named entity recognition for biomedical text
C++
22
star
7

kaggle-stumbleupon2013

My entry to the Kaggle 2013 StumbleUpon competition. Ranked 4th on the final private leaderboard.
Python
15
star
8

langid.js

An off-the-shelf client-side language identification module for JavaScript.
JavaScript
14
star
9

imgevolve

Evolve images from sets of triangles.
Python
7
star
10

updatedir

Rsync-like directory updating over multiple protocols
Python
3
star
11

daifugo

Simulation system for the Japanese card game Daifugo.
Python
2
star
12

linguini.py

linguini.py is a pure-Python implementation of linguini, a vector-space model language identifier with support for bilingual and trilingual documents.
Python
2
star
13

forum_features

Data model for manipulating forum data.
Python
1
star
14

assignmentprint

Pretty printer for student-submitted assignments. Helps with prettyprinting student code and generating reports.
Python
1
star
15

language_data

Pythonic interface to natural language metadata
ASP
1
star