kjahan/k_means

Stars: 167
Rank: 226,635 (Top 5 %)
Language: Python
Created over 12 years ago
Updated over 1 year ago

A Python implementation of the k-means clustering algorithm

K-Means

General description

This project is a Python implementation of the k-means clustering algorithm.

Requirements

Set up the conda environment (named kmeans) using the environment.yml file:

conda env create -f environment.yml

Activate conda environment:

conda activate kmeans

(On macOS, run unset PYTHONPATH.)

Input

A list of points in two-dimensional space where each point is represented by a latitude/longitude pair.

Output

The clusters of points. By default, the computed clusters are stored in a CSV file, output.csv. You can specify a different output filename with the --output argument.

How to run:

python -m src.run --input YOUR_LOC_FILE --clusters CLUSTERS_NO

Note that the runner expects the location file to be in the data folder.

Run tests

python -m pytest tests/

Technical details

This project implements the k-means algorithm. It starts with a random point and then successively chooses k-1 more points, each the farthest from those already chosen. It uses these k points as the initial cluster centroids and assigns each input point to the cluster with the closest centroid.
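The seeding and assignment described above can be sketched roughly as follows. This is a minimal illustration, not the project's actual code: the function names farthest_first_centroids and assign_to_nearest are hypothetical, plain Euclidean distance on the coordinate pairs is assumed, and "farthest from the previous ones" is read as the point whose distance to its nearest already-chosen centroid is largest.

```python
import math
import random


def farthest_first_centroids(points, k, seed=None):
    """Pick k starting centroids: one random point, then the farthest remaining ones."""
    rng = random.Random(seed)
    centroids = [rng.choice(points)]
    while len(centroids) < k:
        # Assumption: "farthest" means the point whose distance to its
        # nearest already-chosen centroid is the largest.
        next_point = max(
            points,
            key=lambda p: min(math.dist(p, c) for c in centroids),
        )
        centroids.append(next_point)
    return centroids


def assign_to_nearest(points, centroids):
    """Return, for each point, the index of its closest centroid."""
    return [
        min(range(len(centroids)), key=lambda i: math.dist(p, centroids[i]))
        for p in points
    ]


# Two well-separated groups of (latitude, longitude) pairs.
points = [(0.0, 0.0), (0.0, 1.0), (10.0, 10.0), (10.0, 11.0)]
centroids = farthest_first_centroids(points, k=2, seed=0)
labels = assign_to_nearest(points, centroids)
```

Whichever point is drawn first, the second centroid lands in the other group, so the two groups end up in different clusters. A great-circle distance may suit real latitude/longitude data better than the Euclidean distance assumed here.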

Next, it recomputes the centroids as the means of the resulting clusters and repeats the assignment step, finding which cluster each point now belongs to.

The program repeats these two steps until the cluster centroids converge, that is, until they no longer change. See the following link to read more about this project and for real examples of running the k-means algorithm:

K-Means algorithm description
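As a rough sketch of the loop described in this section, the two alternating steps might look like the code below. The names k_means and mean_point, the max_iterations cap, the Euclidean distance, and the handling of empty clusters are all assumptions made for illustration; the project's own code may differ.

```python
import math


def mean_point(cluster):
    """Component-wise mean of a non-empty list of 2-D points."""
    n = len(cluster)
    return (sum(p[0] for p in cluster) / n, sum(p[1] for p in cluster) / n)


def k_means(points, centroids, max_iterations=100):
    """Alternate assignment and update steps until the centroids stop changing."""
    centroids = list(centroids)
    clusters = [[] for _ in centroids]
    for _ in range(max_iterations):
        # Assignment step: group each point with its closest centroid.
        clusters = [[] for _ in centroids]
        for p in points:
            nearest = min(
                range(len(centroids)),
                key=lambda i: math.dist(p, centroids[i]),
            )
            clusters[nearest].append(p)
        # Update step: move each centroid to the mean of its cluster.
        # Assumption: an empty cluster keeps its previous centroid.
        new_centroids = [
            mean_point(c) if c else centroids[i]
            for i, c in enumerate(clusters)
        ]
        converged = new_centroids == centroids
        centroids = new_centroids
        if converged:
            break
    return centroids, clusters


points = [(0.0, 0.0), (0.0, 2.0), (10.0, 10.0), (10.0, 12.0)]
final_centroids, final_clusters = k_means(points, [(0.0, 0.0), (10.0, 12.0)])
```

Here the centroids settle at the means of the two groups after the second pass. Convergence is tested by exact equality, matching the "no longer change" criterion; a small tolerance is a common alternative for noisy floating-point data.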

To deactivate the conda environment:

conda deactivate
