Dedupe.io (@dedupeio)
  • Stars
    5,242
  • Global Org. Rank 4,380 (Top 2 %)
  • Registered over 7 years ago
  • Most used languages
    Python
    92.0 %
    C++
    4.0 %
    Cython
    4.0 %
  • Location 🇺🇸 United States
  • Country Total Rank 2,106
  • Country Ranking
    Cython
    21
    Python
    302
    C++
    8,333

Top repositories

1

dedupe

🆔 A Python library for accurate and scalable fuzzy matching, record deduplication and entity resolution.
Python
4,080
star
2

csvdedupe

🆔 Command line tool for deduplicating CSV files
Python
409
star
3

dedupe-examples

🆔 Examples for using the dedupe library
Python
403
star
4

address-matching

Python script for matching a list of messy addresses against a gazetteer using dedupe.
Python
60
star
5

affinegap

A Cython implementation of the affine gap string distance
Cython
58
star
6

hcluster

Hierarchical Clustering Algorithms
Python
35
star
7

dedupe-geocoder

Demonstration of how dedupe might be used as a geocoder
Python
17
star
8

doublemetaphone

🔉 Python wrapper for a C++ Double Metaphone
C++
15
star
9

fuzzycategory

Fuzzy Categorical Distances
Python
14
star
10

rlr

Regularized Logistic Regression
Python
11
star
11

dedupe-variable-address

Address Variable Type for dedupe
Python
9
star
12

dedupe-variable-person

Dedupe variable for person names. Just people, no companies.
Python
9
star
13

dedupe-variable-name

Name variable type for dedupe
Python
8
star
14

soft-tfidf

Misspelling-tolerant tf-idf similarity metric
6
star
15

highered

CRF Edit Distance
Python
6
star
16

dedupeio-web-api-docs

The Dedupe.io web API allows matching and training against projects using a standard RESTful framework.
Python
6
star
17

dedupe-variable-employer

Python
5
star
18

dedupe-vowpal

Vowpal Wabbit Active Labeler for Dedupe
Python
4
star
19

dedupe-variable-datetime

DateTime variable for dedupe
Python
4
star
20

dedupe-variable-fuzzycategory

Dedupe Variable for Fuzzy Categories
Python
4
star
21

categorical-distance

Compare categorical variables
Python
4
star
22

parseratorvariable

Base class for dedupe variables for parsed fields
Python
3
star
23

simplecosine

Simple cosine distance
Python
3
star
24

dedupe-variable-number

Try to cast strings to numbers, then compare
Python
3
star
25

datetime-distance

Compare dates and times
Python
3
star
26

dedupe-variable-ilcs

Dedupe variable for Illinois Compiled Statute (ILCS) codes
Python
2
star