eXascale Infolab (@eXascaleInfolab)

Top repositories

1

PyExPool

Python Multi-Process Execution Pool: concurrent asynchronous execution pool with custom resource constraints (memory, timeouts, affinity, CPU cores and caching), load balancing and profiling capabilities of the external apps on NUMA architecture
Python
161
star
2

LFR-Benchmark_UndirWeightOvp

Extended version of the Lancichinetti-Fortunato-Radicchi Benchmark for Undirected Weighted Overlapping networks to evaluate clustering algorithms using generated ground-truth communities
C++
76
star
3

JUST

Python
45
star
4

LBSN2Vec

Code release for LBSN2Vec
C
44
star
5

Flashback_code

Python
44
star
6

HINGE_code

Python
37
star
7

NodeSketch

NodeSketch: Highly-Efficient Graph Embeddings via Recursive Sketching
Python
32
star
8

TRank

Ranking Entity Types using the Web of Data
Scala
30
star
9

bench-vldb20

C
30
star
10

RETA_code

Python
30
star
11

ActiveLink

Deep active learning framework for link prediction in knowledge graphs
Python
25
star
12

HistoSketch

Implementation of HistoSketch and D2HistoSketch in MATLAB
MATLAB
20
star
13

clubmark

Clubmark: a Parallel Isolation Framework for Benchmarking and Profiling of Clustering (Community Detection) Algorithms Considering Overlaps (Covers)
Python
20
star
14

pytrec_eval

A library to evaluate TREC-like runs with TREC-like qrels. Implements similarity of rankings, t-test between runs, etc…
Python
19
star
15

PyCABeM

Python Benchmarking Framework for the Clustering Algorithms Evaluation: networks generation and shuffling; failover execution and resource consumption tracing (peak RAM RSS, CPU, ...); evaluation of Modularity, conductance, NMI and F1 Score for overlapping communities
Python
19
star
16

GenConvNMI

Generalized Conventional Mutual Information (GenConvMI) - NMI for overlapping (soft, fuzzy) clusters (communities), compatible with standard NMI, pure C++ version (single executable)
C++
19
star
17

MARTA

Python
18
star
18

xmeasures

Extremely fast evaluation of the extrinsic clustering measures: various (mean) F1 measures and Omega Index (Fuzzy Adjusted Rand Index) for the multi-resolution clustering with overlaps/covers, standard NMI, clusters labeling
C++
16
star
19

TSM-Bench

Comprehensive Benchmark for Time Series Database Systems
Jupyter Notebook
11
star
20

fashion_nlp_v2

FashionBrain D2.1: Named Entity Recognition and Linking Methods
Python
9
star
21

orbits

C#
6
star
22

PyNetConvert

Network (Graph) Format Converter: RCG, Pajek, Metis, NSL (NCol, SNAP, ...), MATLAB
Python
6
star
23

StaTIX

Statistical Type Inference (both fully automatic and semi supervised) for RDF datasets
Java
6
star
24

daoc

DAOC (Deterministic and Agglomerative Overlapping Clustering algorithm): Stable Clustering of Large Networks
C++
6
star
25

GraphEmbEval

Graph (network) embeddings evaluation framework via classification, Gram matrix construction for link prediction
Python
6
star
26

fashionNLP

Python
5
star
27

pSCAN

pSCAN: Fast and Exact Structural Graph Clustering (with overlaps)
C
5
star
28

sanaphor

Python
5
star
29

2018-Internship-TableDetection

This repository contains the pipeline for table detection/extraction from 'Bundesarchive' documents.
HTML
5
star
30

Wiki2Prop

The companion material for the Wiki2Prop Paper
Python
5
star
31

OpenCrowd

Python
4
star
32

WDCFramework

clone of https://www.assembla.com/spaces/commondata/subversion/source/HEAD/WDCFramework/trunk
Java
4
star
33

daor

DAOR Parameter-free Embedding Framework for Large Graphs (Networks)
C++
4
star
34

CORAD

CORAD: Correlation-Aware Compression of Massive Time Series using Sparse Dictionary Coding
Python
4
star
35

cardinal

Source Code and Companion Material of the Non-Parametric Class Completeness Estimators
Python
3
star
36

TaxoComplete

This is the repository of TaxoComplete: Self-Supervised Taxonomy Completion Leveraging Position-Enhanced Semantic Matching
Python
3
star
37

entity-disambiguation-data-ecir2013

3
star
38

2016-armatweet

NLP components of ArmaTweet devoted to converting tweets into quads of the form (`subject`, `predicate`, `object`, `location`) where `subject`, `object`, and `location` are DBpedia resources, and `predicate` is a WordNet synset.
Scala
3
star
39

axel

Project for exploratory search on scientific articles
Python
3
star
40

thesis_template

LaTeX template for XI BSc/MSc thesis
TeX
3
star
41

hirecs

High Resolution Hierarchical Clustering with Stable State
C++
3
star
42

NetHash

NetHash algorithm from IJCAI 2018
C++
3
star
43

typhon

Deep Learning framework that trains a single model using multiple, heterogeneous datasets leveraging parallel transfer, strictly enforcing feature generalization and even preventing overfitting
Python
3
star
44

inFlux

Task Flow Control
JavaScript
2
star
45

wd-graph

A toolset to work with the Wikidata Graph
Python
2
star
46

WDCTools

Scala
2
star
47

timesvd_vc

Python
2
star
48

preposition-data-cikm2014

Datasets with preposition corrections for CIKM 2014 paper
2
star
49

SNF_disambiguation

2
star
50

vadetis

Jupyter Notebook
2
star
51

pgpr

Python
2
star
52

seer

CSS
2
star
53

resmerge

Resolution levels clustering merger with filtering and clusters deduplication. Flattens a hierarchy/list of multiple resolution levels (clusterings) into a single flat clustering (collection), synchronizing the node base and deduplicating.
C++
2
star
54

typhon_exp

Experiments for the paper: "Typhon: Parallel Transfer on Heterogeneous Datasets for Cancer Detection in Computer-Aided Diagnosis"
Python
1
star
55

cdrec

C++
1
star
56

ase-lab

Lab of Time Series Database Systems
Python
1
star
57

ReVival-Code

PHP
1
star
58

nif-entity-linking-webservice

JavaScript
1
star
59

interval_index

A full set of data structures and experimental data for the CINTIA paper
C++
1
star
60

CDTool

C++
1
star
61

bench-vldb20_full

C
1
star
62

2019_kais-bench

AGS Script
1
star
63

oslom2

Sources of the OSLOM2 (v2.5) clustering algorithm with slightly extended I/O for the benchmarking under Clubmark
C++
1
star
64

tag-recommendation-data-iswc2012

Dataset for the "Tag recommendation" paper from ISWC 2012
1
star
65

scientific_NER_dataset

Judged dataset for NER in scientific documents
1
star
66

scala_utils

Few Scala utils...
Java
1
star
67

CGGC

RG (Randomized Greedy clustering), CGGC_RG (Core Groups Graph ensemble Clustering) or CGGCi_RG (Core Groups Graph ensemble Clustering Iterative) algorithms
C++
1
star
68

BonusBar

BonusBar Django project. An HCI prototype for worker retention.
JavaScript
1
star
69

HIT-Scheduler

Open-source HIT scheduling backend for Amazon Mechanical Turk.
JavaScript
1
star
70

libMoji

The implementation of Moji Visualizations
JavaScript
1
star
71

WikidataSectionLinks

Python
1
star
72

JOINER_code

C++
1
star
73

TInfES

Type Inference Evaluation Scripts & Accessory Apps (used for the StaTIX benchmarking)
Python
1
star
74

sds2020_web_table_annotation

SDS2020 - Annotating Web Tables through Knowledge Bases: A Context-Based Approach
Python
1
star
75

Wikipedia30

A collection of 30 random Wikipedia pages manually annotated with entities.
1
star
76

SMA-17s_CommunityDetection

Community detection programming exercises for the SMA-17s course
Jupyter Notebook
1
star
77

ASE-lab-2023

Time Series Database System Lab 2023
1
star