AutoML Alex
State-of-the-art automated machine learning Python library for tabular data
Works with Tasks:
- Binary Classification
- Regression
- Multiclass Classification (in progress...)
Benchmark Results
The bigger, the better
From AutoML-Benchmark
Scheme
Features
- Automated Data Clean (Auto Clean)
- Automated Feature Engineering (Auto FE)
- Smart Hyperparameter Optimization (HPO)
- Feature Generation
- Feature Selection
- Models Selection
- Cross Validation
- Optimization Time Limit and Early Stopping
- Save and Load (Predict new data)
Installation
pip install automl-alex
Docs
Examples
Classifier:
from automl_alex import AutoMLClassifier
model = AutoMLClassifier()
model.fit(X_train, y_train, timeout=600)
predicts = model.predict(X_test)
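The snippets in this section use `X_train`, `y_train` and `X_test` without defining them. A minimal sketch of one way to build them, assuming the estimators accept pandas DataFrames and using a synthetic scikit-learn dataset (the column names and split ratio are illustrative only):

```python
import pandas as pd
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split

# Synthetic binary classification data; replace with your own table.
features, target = make_classification(
    n_samples=500, n_features=10, random_state=42
)
X = pd.DataFrame(
    features, columns=[f"feature_{i}" for i in range(features.shape[1])]
)
y = pd.Series(target, name="target")

# Hold out 20% of the rows for testing, keeping the class balance.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42, stratify=y
)
```

The resulting `X_train` and `y_train` can then be passed to `fit`, and `X_test` to `predict`.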
Regression:
from automl_alex import AutoMLRegressor
model = AutoMLRegressor()
model.fit(X_train, y_train, timeout=600)
predicts = model.predict(X_test)
DataPrepare:
from automl_alex import DataPrepare
de = DataPrepare()
X_train = de.fit_transform(X_train)
X_test = de.transform(X_test)
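The pattern above fits the preprocessing on the training data only and then reuses the fitted state on the test data. The sketch below illustrates that pattern with plain pandas and scikit-learn; it is not how `DataPrepare` is implemented, and the columns, the median imputation and the `prepare` helper are made up for the illustration:

```python
import pandas as pd
from sklearn.preprocessing import OrdinalEncoder

train = pd.DataFrame(
    {"city": ["Paris", "Rome", "Paris"], "age": [31.0, None, 45.0]}
)
test = pd.DataFrame({"city": ["Rome", "Oslo"], "age": [None, 22.0]})

# "Fit": learn category codes and the imputation value from train only.
encoder = OrdinalEncoder(handle_unknown="use_encoded_value", unknown_value=-1)
encoder.fit(train[["city"]])
age_median = train["age"].median()


def prepare(df):
    """Apply the statistics learned from the training data (hypothetical helper)."""
    out = df.copy()
    out["city"] = encoder.transform(df[["city"]]).ravel()
    out["age"] = df["age"].fillna(age_median)
    return out


train_prepared = prepare(train)
# "Transform": the test set reuses the train statistics; unseen "Oslo" maps to -1.
test_prepared = prepare(test)
```

Learning the statistics from the training set alone keeps information from the test set out of the model.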
Simple Models Wrapper:
from automl_alex import LightGBMClassifier
model = LightGBMClassifier()
model.fit(X_train, y_train)
predicts = model.predict_proba(X_test)
model.opt(X_train, y_train,
    timeout=600,  # optimization time in seconds
)
predicts = model.predict_proba(X_test)
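Predicted probabilities are usually scored with a ranking metric such as ROC AUC. A small sketch with scikit-learn, using hand-written stand-in arrays instead of a fitted model; the shape returned by `predict_proba` is not documented here, so the hypothetical `positive_class_scores` helper handles both a 1-D array and a two-column array:

```python
import numpy as np
from sklearn.metrics import roc_auc_score


def positive_class_scores(proba):
    """Return positive-class probabilities from a 1-D or (n, 2) array."""
    proba = np.asarray(proba)
    return proba[:, 1] if proba.ndim == 2 else proba


# Stand-in values; in practice these come from the test split and the model.
y_test = np.array([0, 0, 1, 1])
predicts = np.array([0.1, 0.4, 0.35, 0.8])

score = roc_auc_score(y_test, positive_class_scores(predicts))
print(f"ROC AUC: {score:.2f}")
```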
More examples in the folder ./examples:
- 01_Quick_Start.ipynb
- 02_Data_Cleaning_and_Encoding_(DataPrepare).ipynb
- 03_Models.ipynb
- 04_ModelsReview.ipynb
- 05_BestSingleModel.ipynb
- Production Docker template
What's inside
It integrates many popular frameworks:
- scikit-learn
- XGBoost
- LightGBM
- CatBoost
- Optuna
- ...
Works with Features
- Categorical Features
- Numerical Features
- Binary Features
- Text
- Datetime
- Timeseries
- Image
Note
- Large datasets require a lot of memory, because the library creates many new features. If your dataset has a large number of features (more than 100), expect memory usage to be high.
Realtime Dashboard
Works with optuna-dashboard
Run
$ optuna-dashboard sqlite:///db.sqlite3
Road Map
- Feature Generation
- Save/Load and Predict on New Samples
- Advanced Logging
- Add opt Pruners
- Docs Site
- DL Encoders
- Add More libs (NNs)
- Multiclass Classification
- Build pipelines