Feature Selection for Machine Learning - Code Repository
Published February 2018
Actively maintained.
Table of Contents
- Basic Selection Methods
  - Removing Constant Features
  - Removing Quasi-Constant Features
  - Removing Duplicated Features
- Correlation Feature Selection
  - Removing Correlated Features
  - Basic Selection Methods + Correlation - Pipeline
- Filter Methods: Univariate Statistical Methods
  - Mutual Information
  - Chi-square distribution
  - ANOVA
  - Basic Selection Methods + Statistical Methods - Pipeline
- Filter Methods: Other Methods and Metrics
  - Univariate ROC-AUC, MSE, etc.
  - Method used in a KDD competition - 2009
- Wrapper Methods
  - Step Forward Feature Selection
  - Step Backward Feature Selection
  - Exhaustive Feature Selection
- Embedded Methods: Linear Model Coefficients
  - Logistic Regression Coefficients
  - Linear Regression Coefficients
  - Effect of Regularization on Coefficients
  - Basic Selection Methods + Correlation + Embedded - Pipeline
- Embedded Methods: Lasso
  - Lasso
  - Basic Selection Methods + Correlation + Lasso - Pipeline
- Embedded Methods: Tree Importance
  - Random Forest derived Feature Importance
  - Tree importance + Recursive Feature Elimination
  - Basic Selection Methods + Correlation + Tree importance - Pipeline
- Hybrid Feature Selection Methods
  - Feature Shuffling
  - Recursive Feature Elimination
  - Recursive Feature Addition