Advanced Scikit-learn training session
Outline
1 Basic algorithms
- Review of supervised learning
- Linear models for classification and regression
- Loss functions, regularization, empirical risk minimization
- Path algorithms
- Exercise: FIXME Regression
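
As a rough illustration of the regularization and path-algorithm topics in this section, here is a minimal sketch. It assumes a synthetic regression problem from `make_regression`; the dataset, the `alpha` values, and the variable names are illustrative choices, not part of the session material. Ridge minimizes squared loss with an L2 penalty, Lasso minimizes squared loss with an L1 penalty, and `lasso_path` computes Lasso coefficients along a decreasing grid of penalty strengths.

```python
import numpy as np
from sklearn.datasets import make_regression
from sklearn.linear_model import Lasso, Ridge, lasso_path
from sklearn.preprocessing import StandardScaler

# Synthetic data (an assumption for illustration): 20 features, 5 informative.
X, y = make_regression(
    n_samples=100, n_features=20, n_informative=5, noise=5.0, random_state=0
)
X = StandardScaler().fit_transform(X)
y = y - y.mean()  # center the target, since lasso_path fits no intercept

# Same squared loss, different penalties: L2 (Ridge) vs L1 (Lasso).
ridge = Ridge(alpha=1.0).fit(X, y)
lasso = Lasso(alpha=1.0).fit(X, y)

# Regularization path: coefficients for a decreasing sequence of alphas.
alphas, coefs, _ = lasso_path(X, y)

# Number of non-zero coefficients at each point on the path.
n_nonzero = (coefs != 0).sum(axis=0)
print("non-zero ridge coefs:", int((ridge.coef_ != 0).sum()))
print("non-zero lasso coefs:", int((lasso.coef_ != 0).sum()))
print("non-zero along path (first, last):", n_nonzero[0], n_nonzero[-1])
```

Plotting each row of `coefs` against `alphas` gives the usual path picture, with features entering the model as the penalty weakens.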
2 Basic tools
- Cross-validation vs train/test split
- GridSearchCV
- Overfitting Parameters
- Scoring Metrics
- Exercise: FIXME
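
The tools listed above can be sketched together as follows. This is one possible arrangement, assuming the iris dataset and a logistic regression model, neither of which is prescribed by the outline: a held-out test split, a cross-validated score on the training part, and a `GridSearchCV` over the regularization strength `C` with an explicit scoring metric.

```python
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import GridSearchCV, cross_val_score, train_test_split

X, y = load_iris(return_X_y=True)

# Hold out a test set that the parameter search never sees.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=0
)

# 5-fold cross-validation on the training data for a fixed model.
cv_scores = cross_val_score(
    LogisticRegression(max_iter=1000), X_train, y_train, cv=5
)

# Grid search over C (illustrative grid), scored by accuracy.
param_grid = {"C": [0.01, 0.1, 1.0, 10.0]}
search = GridSearchCV(
    LogisticRegression(max_iter=1000), param_grid, cv=5, scoring="accuracy"
)
search.fit(X_train, y_train)

# Evaluate the refit best model once on the untouched test set.
test_accuracy = search.score(X_test, y_test)
print(search.best_params_, cv_scores.mean(), test_accuracy)
```

Reporting `search.best_score_` as the final performance would be optimistic, because the same folds were used to pick `C`; the separate test split avoids overfitting the parameters in that sense.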
3 Preprocessing
- Scaling and normalization
- Feature selection:
  - Univariate
  - Model-based
  - RFE
  - Forward / backward selection
- Polynomial and interaction features
- Exercise: FIXME
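
A minimal sketch of the preprocessing steps in this section, assuming a synthetic classification dataset and arbitrary choices of `k=3` and `degree=2` (all illustrative): scaling with `StandardScaler`, univariate selection with `SelectKBest`, recursive feature elimination with `RFE`, and polynomial expansion with `PolynomialFeatures`.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.feature_selection import RFE, SelectKBest, f_classif
from sklearn.linear_model import LogisticRegression
from sklearn.preprocessing import PolynomialFeatures, StandardScaler

X, y = make_classification(
    n_samples=200, n_features=10, n_informative=3, random_state=0
)

# Scaling: zero mean and unit variance per feature.
X_scaled = StandardScaler().fit_transform(X)

# Univariate selection: keep the 3 features with the highest ANOVA F-score.
X_uni = SelectKBest(f_classif, k=3).fit_transform(X_scaled, y)

# RFE: repeatedly fit a model and drop the weakest feature until 3 remain.
rfe = RFE(LogisticRegression(max_iter=1000), n_features_to_select=3)
X_rfe = rfe.fit_transform(X_scaled, y)

# Polynomial and interaction features of degree 2 on the selected columns.
X_poly = PolynomialFeatures(degree=2, include_bias=False).fit_transform(X_uni)

print(X_uni.shape, X_rfe.shape, X_poly.shape)
```

In practice these transformers should be fitted on training data only, which is easiest to guarantee by placing them inside a pipeline.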
4 Advanced tools
- Pipelines
- FeatureUnion
- FunctionTransformer?
- Exercise: FIXME
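
The three tools in this section compose naturally. The sketch below is one hedged example, assuming the iris dataset and an arbitrary choice of steps: a `FunctionTransformer` wrapping `np.log1p`, a `FeatureUnion` that concatenates PCA components with a univariately selected feature, and a `Pipeline` that chains everything before a classifier.

```python
import numpy as np
from sklearn.datasets import load_iris
from sklearn.decomposition import PCA
from sklearn.feature_selection import SelectKBest, f_classif
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import FeatureUnion, Pipeline
from sklearn.preprocessing import FunctionTransformer, StandardScaler

X, y = load_iris(return_X_y=True)

# FeatureUnion: run transformers side by side and concatenate their outputs.
features = FeatureUnion([
    ("pca", PCA(n_components=2)),
    ("kbest", SelectKBest(f_classif, k=1)),
])

# Pipeline: run steps in sequence; only the last step is a predictor.
pipe = Pipeline([
    ("log", FunctionTransformer(np.log1p)),  # stateless transform from a function
    ("scale", StandardScaler()),
    ("features", features),
    ("clf", LogisticRegression(max_iter=1000)),
])

# Each CV fold refits every step on that fold's training part only.
scores = cross_val_score(pipe, X, y, cv=5)

# Inspect what the classifier actually receives: 2 PCA + 1 selected column.
pipe.fit(X, y)
X_union = pipe[:-1].transform(X)
print(scores.mean(), X_union.shape)
```

Step parameters are addressable as `step__param` (for example `features__pca__n_components`), which is what makes a whole pipeline searchable with `GridSearchCV`.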
5 Advanced Supervised Learning
- Decision Tree Recap
- Random Forests
- Gradient Boosting / xgboost
- Kernel SVMs
- Kernel approximation
- Neural Networks
- Exercise: FIXME
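
To make the model families in this section concrete, here is a small comparison sketch. The two-moons dataset, the hyperparameters, and the model names in the dictionary are assumptions chosen for illustration. The last entry approximates the RBF kernel with `Nystroem` and fits a linear model on top, which is the idea behind kernel approximation.

```python
from sklearn.datasets import make_moons
from sklearn.ensemble import GradientBoostingClassifier, RandomForestClassifier
from sklearn.kernel_approximation import Nystroem
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.svm import SVC

# A non-linearly separable toy problem.
X, y = make_moons(n_samples=300, noise=0.25, random_state=0)

models = {
    "random_forest": RandomForestClassifier(n_estimators=100, random_state=0),
    "gradient_boosting": GradientBoostingClassifier(random_state=0),
    "kernel_svm": SVC(kernel="rbf", gamma=1.0, C=1.0),
    # Explicit low-rank feature map for the same RBF kernel + linear model.
    "nystroem_linear": make_pipeline(
        Nystroem(kernel="rbf", gamma=1.0, n_components=50, random_state=0),
        LogisticRegression(max_iter=1000),
    ),
}

# Mean 5-fold accuracy per model.
results = {
    name: cross_val_score(model, X, y, cv=5).mean()
    for name, model in models.items()
}
for name, score in results.items():
    print(f"{name}: {score:.3f}")
```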
6 Unsupervised feature extraction and visualization
- PCA
- NMF
- Robust PCA?
- TSNE
- Exercise: FIXME
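
A short sketch of the three implemented methods above, assuming a 300-sample subset of the digits dataset and arbitrary component counts (both illustrative). PCA and NMF learn a reusable linear mapping, whereas `TSNE` only embeds the data it is fitted on and is mainly used for visualization.

```python
from sklearn.datasets import load_digits
from sklearn.decomposition import NMF, PCA
from sklearn.manifold import TSNE

X, y = load_digits(return_X_y=True)
X, y = X[:300], y[:300]  # small subset to keep t-SNE fast

# PCA: orthogonal directions of maximal variance.
pca = PCA(n_components=2)
X_pca = pca.fit_transform(X)

# NMF: non-negative parts-based factorization (needs non-negative input).
nmf = NMF(n_components=5, init="nndsvda", max_iter=500, random_state=0)
X_nmf = nmf.fit_transform(X)

# t-SNE: non-linear 2D embedding; has fit_transform but no transform.
X_tsne = TSNE(
    n_components=2, perplexity=30, init="pca", random_state=0
).fit_transform(X)

print(X_pca.shape, X_nmf.shape, X_tsne.shape)
print("variance explained by 2 PCs:", pca.explained_variance_ratio_.sum())
```

A scatter plot of `X_pca` or `X_tsne` colored by `y` is the typical way to compare the embeddings.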
7 Outlier Detection
- Elliptic Envelope?
- Isolation Forest?
- What else?
- KDE?
- SVM?
- Robust PCA?
- Exercise: FIXME
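
Several of the candidates above share the same scikit-learn interface, which the following sketch illustrates. It assumes synthetic 2D data with 10 injected outliers and a contamination level of 5%; these numbers and the dictionary keys are illustrative. Each detector's `predict` returns `1` for inliers and `-1` for outliers.

```python
import numpy as np
from sklearn.covariance import EllipticEnvelope
from sklearn.ensemble import IsolationForest
from sklearn.svm import OneClassSVM

rng = np.random.RandomState(0)
X_inliers = rng.normal(size=(200, 2))                 # Gaussian blob
X_outliers = rng.uniform(low=-8, high=8, size=(10, 2))  # scattered points
X = np.vstack([X_inliers, X_outliers])

detectors = {
    "elliptic_envelope": EllipticEnvelope(contamination=0.05, random_state=0),
    "isolation_forest": IsolationForest(contamination=0.05, random_state=0),
    "one_class_svm": OneClassSVM(nu=0.05, gamma="scale"),
}

# Fit each detector without labels and predict on the same data.
labels = {name: det.fit(X).predict(X) for name, det in detectors.items()}

for name, lab in labels.items():
    print(f"{name}: flagged {(lab == -1).sum()} of {len(lab)} points")
```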
8 Gaussian Processes
- Non-iid data
- Gaussian fit...
- Covariance matrix is a kernel
- Regression, outlier detection, time series modelling
- Exercise: FIXME
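
The regression use case can be sketched as below, assuming noisy samples of a sine function and an RBF-plus-noise kernel (both illustrative choices). The kernel evaluated on the training inputs is the prior covariance matrix of the function values, and the fitted model returns a predictive mean together with a standard deviation.

```python
import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import RBF, WhiteKernel

rng = np.random.RandomState(0)
X = np.sort(rng.uniform(0, 10, size=(40, 1)), axis=0)
y = np.sin(X).ravel() + 0.1 * rng.normal(size=40)

# Smooth signal (scaled RBF) plus independent observation noise.
kernel = 1.0 * RBF(length_scale=1.0) + WhiteKernel(noise_level=0.1)
gpr = GaussianProcessRegressor(kernel=kernel, normalize_y=True, random_state=0)
gpr.fit(X, y)  # kernel hyperparameters are tuned by maximizing the marginal likelihood

# The fitted kernel evaluated on the inputs is a covariance matrix.
K = gpr.kernel_(X)

# Predictive mean and uncertainty on a dense grid.
X_new = np.linspace(0, 10, 100).reshape(-1, 1)
mean, std = gpr.predict(X_new, return_std=True)

print(gpr.kernel_)
print(K.shape, mean.shape, std.shape)
```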
9 More Neural Networks
10 Beyond standard sklearn
- Warm starts
- Out-of-core learning
- Custom estimators
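
The three topics in this last section can each be sketched in a few lines. The example below makes several assumptions for illustration: the synthetic dataset, the batch size of 200, and the `ClipOutliers` transformer, which is a hypothetical class written for this sketch and not part of scikit-learn.

```python
import numpy as np
from sklearn.base import BaseEstimator, TransformerMixin
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import SGDClassifier

X, y = make_classification(n_samples=600, n_features=10, random_state=0)

# Warm start: keep the 10 fitted trees and add 10 more on the second fit.
forest = RandomForestClassifier(n_estimators=10, warm_start=True, random_state=0)
forest.fit(X, y)
forest.set_params(n_estimators=20)
forest.fit(X, y)

# Out of core: feed the data in batches through partial_fit.
sgd = SGDClassifier(random_state=0)
classes = np.unique(y)  # all classes must be declared on the first call
for start in range(0, len(X), 200):
    batch = slice(start, start + 200)
    sgd.partial_fit(X[batch], y[batch], classes=classes)


class ClipOutliers(BaseEstimator, TransformerMixin):
    """Hypothetical transformer: clip each feature to learned percentiles."""

    def __init__(self, lower=1.0, upper=99.0):
        # Store constructor arguments unchanged so get_params/clone work.
        self.lower = lower
        self.upper = upper

    def fit(self, X, y=None):
        X = np.asarray(X)
        # Learned attributes end with an underscore by convention.
        self.low_ = np.percentile(X, self.lower, axis=0)
        self.high_ = np.percentile(X, self.upper, axis=0)
        return self

    def transform(self, X):
        return np.clip(np.asarray(X), self.low_, self.high_)


X_clipped = ClipOutliers().fit_transform(X)
print(len(forest.estimators_), sgd.classes_, X_clipped.shape)
```

Because `ClipOutliers` follows the estimator conventions, it can be used as a pipeline step and tuned with `GridSearchCV` like any built-in transformer.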