Clustering benchmarks
Datasets
This project contains a collection of labeled clustering problems found in the literature. Most of the datasets were artificially created.
The benchmark includes:
Artificial data
Experiments
This project contains a set of benchmarks of clustering methods on various datasets. It depends on the Clueminer project.
To run a benchmark, first compile the dependencies into a single JAR file:
mvn assembly:assembly
Consensus experiment
This experiment runs the same algorithm repeatedly on a dataset:
./run consensus --dataset "triangle1" --repeat 10
By default, the k-means algorithm is used.
For available datasets, see the resources folder.
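The idea behind repeated runs can be sketched in a few lines of Java. This is a minimal, self-contained illustration and not Clueminer's API or this project's implementation: the class `RepeatedKMeans`, its methods, and the toy data are all hypothetical. It assumes a plain k-means (Lloyd's algorithm) that is restarted with a different seed on every run, and it records how often each pair of points ends up in the same cluster (a co-association matrix, one common way to compare or combine repeated runs).

```java
import java.util.Random;

// Hypothetical sketch: not part of Clueminer or of this project.
class RepeatedKMeans {

    // Squared Euclidean distance between two points of equal dimension.
    static double dist2(double[] a, double[] b) {
        double s = 0;
        for (int j = 0; j < a.length; j++) {
            double diff = a[j] - b[j];
            s += diff * diff;
        }
        return s;
    }

    // One k-means run (Lloyd's algorithm); returns a cluster index per point.
    // Assumes k <= data.length.
    static int[] kMeans(double[][] data, int k, long seed, int maxIter) {
        Random rnd = new Random(seed);
        int n = data.length, d = data[0].length;
        // Pick k distinct points as initial centers (partial Fisher-Yates shuffle).
        int[] idx = new int[n];
        for (int i = 0; i < n; i++) idx[i] = i;
        double[][] centers = new double[k][];
        for (int c = 0; c < k; c++) {
            int j = c + rnd.nextInt(n - c);
            int tmp = idx[c]; idx[c] = idx[j]; idx[j] = tmp;
            centers[c] = data[idx[c]].clone();
        }
        int[] assign = new int[n];
        for (int iter = 0; iter < maxIter; iter++) {
            // Assignment step: move each point to its nearest center.
            boolean changed = false;
            for (int i = 0; i < n; i++) {
                int best = 0;
                for (int c = 1; c < k; c++) {
                    if (dist2(data[i], centers[c]) < dist2(data[i], centers[best])) best = c;
                }
                if (best != assign[i]) { assign[i] = best; changed = true; }
            }
            if (!changed && iter > 0) break; // converged
            // Update step: recompute each center as the mean of its points.
            double[][] sum = new double[k][d];
            int[] count = new int[k];
            for (int i = 0; i < n; i++) {
                count[assign[i]]++;
                for (int j = 0; j < d; j++) sum[assign[i]][j] += data[i][j];
            }
            for (int c = 0; c < k; c++) {
                if (count[c] == 0) continue; // keep the old center of an empty cluster
                for (int j = 0; j < d; j++) centers[c][j] = sum[c][j] / count[c];
            }
        }
        return assign;
    }

    // Fraction of runs in which points i and j share a cluster.
    static double[][] coAssociation(double[][] data, int k, int repeat) {
        int n = data.length;
        int[][] together = new int[n][n];
        for (int r = 0; r < repeat; r++) {
            int[] labels = kMeans(data, k, r, 100); // the run index doubles as the seed
            for (int i = 0; i < n; i++)
                for (int j = 0; j < n; j++)
                    if (labels[i] == labels[j]) together[i][j]++;
        }
        double[][] co = new double[n][n];
        for (int i = 0; i < n; i++)
            for (int j = 0; j < n; j++)
                co[i][j] = together[i][j] / (double) repeat;
        return co;
    }

    public static void main(String[] args) {
        double[][] data = {
            {0, 0}, {0, 1}, {1, 0},       // first blob
            {10, 10}, {10, 11}, {11, 10}  // second blob
        };
        double[][] co = coAssociation(data, 2, 10);
        System.out.println(co[0][1]); // two points from the same blob
        System.out.println(co[0][3]); // two points from different blobs
    }
}
```

On this toy data the two blobs are far enough apart that every run recovers them, so the program prints 1.0 and then 0.0. On harder datasets the values would typically fall between 0 and 1, which gives a rough picture of how stable the algorithm is across restarts.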