Data Sets for Machine Learning Practice
Name | Description | Source |
---|---|---|
iris | 150 flowers (rows) belonging to 3 species (setosa, versicolor, and virginica) of the Iris genus. The dataset consists of 4 input variables (sepal length, sepal width, petal length, and petal width) and 1 output variable (the class label of the Iris species: setosa, versicolor, or virginica). | 1 |
dhfr | 325 molecules (rows) with biological activity against the DHFR enzyme (an antimalarial drug target). The dataset consists of 228 input variables (molecular descriptors describing the physicochemical properties of the molecule) and 1 output variable (the biological activity as being either active or inactive). | 2 |
heart-disease-cleveland | 303 patients (rows) who have been diagnosed as having (diagnosis score of 1, 2, 3, or 4) or not having (diagnosis score of 0) heart disease. The dataset consists of 13 input variables (the health parameters) and 1 output variable (diagnosis). | 3 |
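
The heart-disease-cleveland row describes a diagnosis score of 0 for no heart disease and 1, 2, 3, or 4 for heart disease. For a two-class problem, that score can be collapsed into a binary label. The following is a minimal sketch, assuming the score has already been parsed as an integer; the function name `binarize_diagnosis` and the sample scores are hypothetical and are not taken from the dataset.

```python
def binarize_diagnosis(score: int) -> int:
    """Map a diagnosis score (0-4) to 0 (no disease) or 1 (disease)."""
    if score not in range(5):
        raise ValueError(f"diagnosis score must be 0-4, got {score!r}")
    return 0 if score == 0 else 1


# Illustrative scores only; these are not rows from the actual dataset.
sample_scores = [0, 2, 0, 4, 1]
labels = [binarize_diagnosis(s) for s in sample_scores]
print(labels)
```

Keeping the original 0 to 4 score instead would make this a five-class problem, so the choice depends on the task being practiced.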