  • Stars
    19
  • Rank 1,163,249 (Top 23 %)
  • Language
    Python
  • License
    Other
  • Created over 9 years ago
  • Updated almost 6 years ago

Repository Details

Python benchmarking framework for evaluating clustering algorithms: network generation and shuffling; failover execution and resource consumption tracing (peak RAM RSS, CPU, ...); evaluation of modularity, conductance, NMI and F1 score for overlapping communities

More Repositories

1

PyExPool

Python Multi-Process Execution Pool: concurrent asynchronous execution pool with custom resource constraints (memory, timeouts, affinity, CPU cores and caching), load balancing and profiling capabilities of the external apps on NUMA architecture
Python
163
star
2

LFR-Benchmark_UndirWeightOvp

Extended version of the Lancichinetti-Fortunato-Radicchi Benchmark for Undirected Weighted Overlapping networks to evaluate clustering algorithms using generated ground-truth communities
C++
79
star
3

Flashback_code

Python
45
star
4

JUST

Python
45
star
5

LBSN2Vec

Code release for LBSN2Vec
C
44
star
6

HINGE_code

Python
38
star
7

bench-vldb20

C
35
star
8

NodeSketch

NodeSketch: Highly-Efficient Graph Embeddings via Recursive Sketching
Python
33
star
9

TRank

Ranking Entity Types using the Web of Data
Scala
30
star
10

RETA_code

Python
30
star
11

ActiveLink

Deep active learning framework for link prediction in knowledge graphs
Python
24
star
12

HistoSketch

Implementation of HistoSketch and D2HistoSketch in MATLAB
MATLAB
20
star
13

GenConvNMI

Generalized Conventional Mutual Information (GenConvMI) - NMI for overlapping (soft, fuzzy) clusters (communities), compatible with standard NMI, pure C++ version (single executable)
C++
20
star
14

clubmark

Clubmark: a Parallel Isolation Framework for Benchmarking and Profiling of Clustering (Community Detection) Algorithms Considering Overlaps (Covers)
Python
20
star
15

pytrec_eval

A library to evaluate TREC-like runs with TREC-like qrels. Implements similarity of rankings, t-test between runs, etc…
Python
19
star
16

MARTA

Python
18
star
17

xmeasures

Extremely fast evaluation of the extrinsic clustering measures: various (mean) F1 measures and Omega Index (Fuzzy Adjusted Rand Index) for the multi-resolution clustering with overlaps/covers, standard NMI, clusters labeling
C++
18
star
18

TSM-Bench

Comprehensive Benchmark for Time Series Database Systems
Jupyter Notebook
15
star
19

fashion_nlp_v2

FashionBrain D2.1: Named Entity Recognition and Linking Methods
Python
11
star
20

fashionNLP

Python
6
star
21

orbits

C#
6
star
22

PyNetConvert

Network (Graph) Format Converter: RCG, Pajek, Metis, NSL (NCol, SNAP, ...), MATLAB
Python
6
star
23

daoc

DAOC (Deterministic and Agglomerative Overlapping Clustering algorithm): Stable Clustering of Large Networks
C++
6
star
24

StaTIX

Statistical Type Inference (both fully automatic and semi-supervised) for RDF datasets
Java
6
star
25

GraphEmbEval

Graph (network) embeddings evaluation framework via classification, Gram matrix construction for link prediction
Python
6
star
26

pSCAN

pSCAN: Fast and Exact Structural Graph Clustering (with overlaps)
C
5
star
27

TaxoComplete

This is the repository of TaxoComplete: Self-Supervised Taxonomy Completion Leveraging Position-Enhanced Semantic Matching
Python
5
star
28

sanaphor

Python
5
star
29

2018-Internship-TableDetection

This repository contains the pipeline for table detection/extraction from 'Bundesarchive' documents.
HTML
5
star
30

CORAD

CORAD: Correlation-Aware Compression of Massive Time Series using Sparse Dictionary Coding
Python
5
star
31

Wiki2Prop

The companion material for the Wiki2Prop Paper
Python
5
star
32

OpenCrowd

Python
4
star
33

WDCFramework

clone of https://www.assembla.com/spaces/commondata/subversion/source/HEAD/WDCFramework/trunk
Java
4
star
34

daor

DAOR Parameter-free Embedding Framework for Large Graphs (Networks)
C++
4
star
35

cardinal

Source Code and Companion Material of the Non-Parametric Class Completeness Estimators
Python
3
star
36

entity-disambiguation-data-ecir2013

3
star
37

2016-armatweet

NLP components of ArmaTweet devoted to converting tweets into quads of the form (`subject`, `predicate`, `object`, `location`) where `subject`, `object`, and `location` are DBpedia resources, and `predicate` is a WordNet synset.
Scala
3
star
38

axel

Project for exploratory search on scientific articles
Python
3
star
39

thesis_template

LaTeX template for XI BSc/MSc thesis
TeX
3
star
40

hirecs

High Resolution Hierarchical Clustering with Stable State
C++
3
star
41

seer

CSS
3
star
42

NetHash

NetHash algorithm from IJCAI 2018
C++
3
star
43

typhon

Deep Learning framework that trains a single model using multiple, heterogeneous datasets leveraging parallel transfer, strictly enforcing feature generalization and even preventing overfitting
Python
3
star
44

inFlux

Task Flow Control
JavaScript
2
star
45

wd-graph

A toolset to work with the Wikidata Graph
Python
2
star
46

WDCTools

Scala
2
star
47

timesvd_vc

Python
2
star
48

SNF_disambiguation

2
star
49

Event-Detection-Twitter

This is the repository for data related to our submission to TKDE titled "Event Detection on Microposts: a Comparison of Four Approaches".
2
star
50

vadetis

Jupyter Notebook
2
star
51

pgpr

Python
2
star
52

preposition-data-cikm2014

Datasets with preposition corrections for CIKM 2014 paper
2
star
53

resmerge

Resolution levels clustering merger with filtering and cluster deduplication. Flattens a hierarchy/list of multiple resolution levels (clusterings) into a single flat clustering (collection), synchronizing the node base and deduplicating.
C++
2
star
54

typhon_exp

Experiments for the paper: "Typhon: Parallel Transfer on Heterogeneous Datasets for Cancer Detection in Computer-Aided Diagnosis"
Python
1
star
55

cdrec

C++
1
star
56

ase-lab

Lab of Time Series Database Systems
Python
1
star
57

scala_utils

Few Scala utils...
Java
1
star
58

ReVival-Code

PHP
1
star
59

nif-entity-linking-webservice

JavaScript
1
star
60

interval_index

A full set of data structures and experimental data for the CINTIA paper
C++
1
star
61

CDTool

C++
1
star
62

bench-vldb20_full

C
1
star
63

oslom2

Sources of the OSLOM2 (v2.5) clustering algorithm with slightly extended I/O for the benchmarking under Clubmark
C++
1
star
64

tag-recommendation-data-iswc2012

Dataset for the "Tag recommendation" paper from ISWC 2012
1
star
65

scientific_NER_dataset

Judged dataset for NER in scientific documents
1
star
66

2019_kais-bench

AGS Script
1
star
67

BonusBar

BonusBar Django project. An HCI prototype for worker retention.
JavaScript
1
star
68

libMoji

The implementation of Moji Visualizations
JavaScript
1
star
69

WikidataSectionLinks

Python
1
star
70

TInfES

Type Inference Evaluation Scripts & Accessory Apps (used for the StaTIX benchmarking)
Python
1
star
71

JOINER_code

C++
1
star
72

sds2020_web_table_annotation

SDS2020 - Annotating Web Tables through Knowledge Bases: A Context-Based Approach
Python
1
star
73

Wikipedia30

A collection of 30 random Wikipedia pages manually annotated with entities.
1
star
74

HIT-Scheduler

Open-source HIT scheduling backend for Amazon Mechanical Turk.
JavaScript
1
star
75

SMA-17s_CommunityDetection

Community detection programming exercises for the SMA-17s course
Jupyter Notebook
1
star
76

ASE-lab-2023

Time Series Database System Lab 2023
1
star
77

CGGC

RG (Randomized Greedy clustering), CGGC_RG (Core Groups Graph ensemble Clustering) or CGGCi_RG (Core Groups Graph ensemble Clustering Iterative) algorithms
C++
1
star