  • Stars: 274
  • Rank: 150,274 (Top 3 %)
  • License: MIT License
  • Created over 5 years ago
  • Updated almost 2 years ago

Repository Details

This repository contains a collection of UCI (real-life) datasets and synthetic (artificial) datasets, with cluster labels and MATLAB files, ready to use with clustering algorithms.

Clustering-Datasets

This repository contains a collection of UCI (real-life) datasets and synthetic (artificial) datasets (with cluster labels).

Artificial data

2d-10c 2d-20c-no0 2d-3c-no123 2d-4c-no4 2d-4c-no9 2d-4c 2sp2glob 3-spiral 3MC D31 DS577 DS850 R15 aggregation atom banana birch-rg1 birch-rg2 birch-rg3 chainlink cluto-t4.8k cluto-t5.8k cluto-t7.10k cluto-t8.8k complex8 complex9 compound cure-t0-2000n-2D cure-t1-2000n-2D cure-t2-4k curves1 curves2 dartboard1 dartboard2 dense-disk-3000 dense-disk-5000 diamond9 disk-1000n disk-3000n disk-4000n disk-4500n disk-4600n disk-5000n disk-6000n donut1 donut2 donut3 donutcurves ds2c2sc13 ds3c3sc6 ds4c2sc8 elliptical_10_2 elly-2d10c13s engytime flame fourty golfball hepta insect jain long1 long2 long3 longsquare lsun mopsi-finland mopsi-joensuu pathbased rings s-set1 s-set2 s-set3 s-set4 sizes1 sizes2 sizes3 sizes4 sizes5 smile1 smile2 smile3 spherical_4_3 spherical_5_2 spherical_6_2 spiral spiralsquare square1 square2 square3 square4 square5 st900 target tetra triangle1 triangle2 twenty twodiamonds wingnut xclara zelnik1 zelnik2 zelnik3 zelnik4 zelnik5 zelnik6
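Before a labeled dataset is passed to a clustering algorithm, its coordinates are usually separated from its cluster labels. The following is a minimal sketch of that step, not the repository's own loader. It assumes a dataset has been exported to a plain text file with one point per line, numeric features separated by spaces or commas, and the cluster label in the last column. That layout and the function name `load_labeled_points` are assumptions made for illustration; check the actual file format before relying on it.

```python
from pathlib import Path


def load_labeled_points(path):
    """Return (points, labels) from a text file whose last column is the cluster label.

    Assumed layout (hypothetical): one point per line, features separated by
    spaces or commas, label in the final column.
    """
    points, labels = [], []
    for line in Path(path).read_text().splitlines():
        line = line.strip()
        # Skip blank lines and lines assumed to be comments or headers (% or @).
        if not line or line[0] in "%@":
            continue
        fields = line.replace(",", " ").split()
        points.append([float(value) for value in fields[:-1]])
        labels.append(fields[-1])  # kept as a string; labels may be non-numeric
    return points, labels
```

The labels are returned separately so they can be used only for evaluating a clustering result and are not fed to the algorithm itself.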


Frequently asked questions ❔

How can I thank you for creating and sharing this repository? 🌷

You can star and fork this repository. Starring and forking are free for you, but they tell me and other people that it was helpful and that you like this tutorial.

Go here if you aren't here already and click the ➞ ✰ Star and ⵖ Fork buttons in the top right corner. You will be asked to create a GitHub account if you don't already have one.


How can I use these datasets without an Internet connection?

  1. Go here and click the big green ➞ Code button in the top right of the page, then click ➞ Download ZIP.

  2. Extract the ZIP and open it. Unfortunately I don't have any more specific instructions because how exactly this is done depends on which operating system you run.
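The extraction in step 2 can also be done the same way on any operating system with Python's standard `zipfile` module. This is a minimal sketch under stated assumptions: the helper name `extract_archive` and the file names in the usage comment are hypothetical, so substitute the name of the ZIP you actually downloaded.

```python
import zipfile
from pathlib import Path


def extract_archive(zip_path, target_dir):
    """Extract every member of zip_path into target_dir and return the member names."""
    target_dir = Path(target_dir)
    # Create the destination folder if it does not exist yet.
    target_dir.mkdir(parents=True, exist_ok=True)
    with zipfile.ZipFile(Path(zip_path)) as archive:
        archive.extractall(target_dir)
        return archive.namelist()


# Example call (hypothetical file names):
# extract_archive("downloaded-datasets.zip", "datasets")
```

After extraction, the datasets are available locally in the target folder and no Internet connection is needed to use them.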

If you have git and know how to use it, you can also clone the repository instead of downloading a ZIP and extracting it. An advantage of doing it this way is that you don't need to download the whole tutorial again to get the latest version of it; all you need to do is pull with git and run ipython notebook again.


Authors ✍️

I'm Dr. Milaan Parmar and I have written this tutorial. If you think you can add to, correct, edit, or enhance this tutorial, you are most welcome 🙏

See GitHub's contributors page for details.

If you have trouble with this tutorial, please tell me about it by creating an issue on GitHub, and I'll make this tutorial better. This is probably the best choice if you had trouble following the tutorial and something in it should be explained better. You will be asked to create a GitHub account if you don't already have one.

If you like this tutorial, please give it a ⭐ star.


Licence 📜

You may use this tutorial freely at your own risk. See LICENSE.
