This repository contains slides and documented R examples to accompany several chapters of the popular data mining textbook:
Pang-Ning Tan, Michael Steinbach, Anuj Karpatne and Vipin Kumar, Introduction to Data Mining, Addison Wesley, 1st or 2nd edition.
The slides and examples are used in my course CS 7331 - Data Mining taught at SMU and will be regularly updated and improved. The code examples are now compiled into the free online book
An R Companion for Introduction to Data Mining, which is published under the Creative Commons Attribution license, so you can share and adapt the examples freely.
Please open an issue for corrections or to suggest improvements.
Chapter | Slides | R Code Companion | Sample Textbook Chapters |
---|---|---|---|
1. Introduction | Slides | R Code | |
2. Data | Slides: Data, Exploration | R Code | |
3. Classification: Basic Concepts and Techniques | Slides | R Code | Read Chapter 3 |
4. Classification: Alternative Techniques | Slides | R Code | |
5. Association Analysis: Basic Concepts and Algorithms | Slides | R Code | Read Chapter 5 |
7. Cluster Analysis: Basic Concepts and Algorithms | Slides | R Code | Read Chapter 7 |
- PowerPoint presentation files for a data mining course can be found in the repository directory slides. A slide shows an R symbol at the bottom whenever R code examples are available for it.
- Datasets for projects: Datasets can be found at https://www.kaggle.com/datasets
- More instructional material can be found on the course web site of CS 7331 - Data Mining
All code and documents in this repository are licensed under a Creative Commons Attribution-NonCommercial 4.0 International License.
For questions, please contact Michael Hahsler.