Decision Trees - An Introduction
Abstract
This project work was created in the context of the course Artificial Intelligence in the winter semester 2013/2014 at Friedrich-Alexander-University, Erlangen. Besides this seminar paper, an introductory presentation was given and an implementation of decision trees was developed. The presentation is available only in German.
This seminar paper gives a short introduction to the theory and application of decision trees.
After this introduction, a theoretical part leads into a practical part, which illustrates the theory with examples. The final part summarizes and compares the introduced algorithms and gives a brief outlook on research fields of decision trees that are not covered here.
In contrast to the presentation given during the seminar, this seminar paper assumes basic knowledge of graph theory, complexity, and machine learning. Instead of an introduction to these underlying topics, a deeper look at four decision tree algorithm families is provided: CHAID, CART, ID3, and C4.5.
The focus of all Python implementations is on classification. This limitation does not imply that regression is unimportant; rather, a broader treatment would exceed the scope of this seminar paper.
Table of Contents
- Introduction
- What is a decision tree?
- Taxonomy
- About this paper
- Theory of Decision Trees
- Definitions
- Decision Tree Learning
- Splitting Criterion
- Stopping Criterion
- Tree Pruning
- Selected Algorithms
- Chi-squared Automatic Interaction Detector (CHAID)
- Iterative Dichotomiser 3 (ID3)
- Classification And Regression Tree (CART)
- C4.5
- Discussion
- Advantages
- Disadvantages
- Outlook
- Complexity
- Missing Attributes
- Random Forests
- Summary & Conclusion
- Applications
- Programming Example
- Summary