Automated modeling and machine learning framework FEDOT


FEDOT is an open-source framework for automated modeling and machine learning (AutoML) problems. This framework is distributed under the 3-Clause BSD license.

It provides automatic generative design of machine learning pipelines for various real-world problems. The core of FEDOT is based on an evolutionary approach and supports classification (binary and multiclass), regression, clustering, and time series prediction problems.
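To give a feel for what an evolutionary search over pipeline structures involves, here is a deliberately simplified sketch. It is not FEDOT's optimizer: the operation names, the `toy_fitness` function (a stand-in for a real validation score), and the mutation rules are all assumptions made for illustration.

```python
import random

OPERATIONS = ["scaling", "pca", "rf", "knn", "logit"]  # illustrative names only

def toy_fitness(candidate):
    """Stand-in for a validation score; NOT a real model evaluation.

    Rewards distinct operations and penalizes long candidates.
    """
    return len(set(candidate)) - 0.3 * len(candidate)

def mutate(candidate, rng):
    """Return a copy with one operation added, removed, or replaced."""
    child = list(candidate)
    action = rng.choice(["add", "remove", "replace"])
    if action == "add" or len(child) == 1:
        # Single-operation candidates always grow, so none becomes empty.
        child.insert(rng.randrange(len(child) + 1), rng.choice(OPERATIONS))
    elif action == "remove":
        child.pop(rng.randrange(len(child)))
    else:
        child[rng.randrange(len(child))] = rng.choice(OPERATIONS)
    return child

def evolve(generations=30, population_size=8, seed=0):
    rng = random.Random(seed)
    population = [[rng.choice(OPERATIONS)] for _ in range(population_size)]
    history = []
    for _ in range(generations):
        # Keep the better half unchanged (elitism), refill with mutated copies.
        population.sort(key=toy_fitness, reverse=True)
        survivors = population[: population_size // 2]
        children = [mutate(rng.choice(survivors), rng) for _ in survivors]
        population = survivors + children
        history.append(max(toy_fitness(c) for c in population))
    return max(population, key=toy_fitness), history

best, history = evolve()
print(best, history[-1])
```

Because the best candidates survive each generation unchanged, the best score recorded in `history` never decreases. In a real AutoML setting the fitness would come from training and validating each candidate pipeline, which is where most of the time budget goes.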

[Figure: The structure of the AutoML workflow in FEDOT]

The key feature of the framework is its management of complex interactions between the blocks of a pipeline. Each pipeline is represented as a graph that defines the connections between data preprocessing and model blocks.
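The graph idea can be illustrated without FEDOT itself. The sketch below uses only the Python standard library; the block names and the dictionary layout are assumptions chosen for illustration and do not reflect FEDOT's internal classes.

```python
from graphlib import TopologicalSorter

# Hypothetical pipeline: each key is a block, each value lists the blocks
# whose output it consumes. Names are illustrative only.
pipeline = {
    "scaling": [],            # preprocessing block, reads the raw data
    "pca": ["scaling"],       # preprocessing block
    "rf": ["scaling"],        # model block
    "knn": ["pca"],           # model block
    "logit": ["rf", "knn"],   # final model combining both branches
}

def execution_order(graph):
    """Return an order in which every block runs after its inputs."""
    return list(TopologicalSorter(graph).static_order())

def root_blocks(graph):
    """Return blocks whose output is not consumed by any other block."""
    consumed = {parent for parents in graph.values() for parent in parents}
    return sorted(set(graph) - consumed)

print(execution_order(pipeline))
print(root_blocks(pipeline))  # ['logit']
```

Representing a pipeline as a directed acyclic graph rather than a linear chain is what allows branches, such as the two model blocks here, to feed a single final model.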

The project is maintained by the research team of the Natural Systems Simulation Lab, which is a part of the National Center for Cognitive Research of ITMO University.

More details about FEDOT are available in the following video:

Introducing Fedot

FEDOT concepts

Installation

The simplest way to install FEDOT is using pip:

$ pip install fedot

Installation with optional dependencies for image and text processing, and for DNNs:

$ pip install fedot[extra]
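After installation, one way to confirm that the package is visible to the current interpreter is to query its distribution metadata. This is a minimal sketch using only the standard library; the helper name `installed_version` is illustrative and not part of FEDOT.

```python
from importlib import metadata

def installed_version(package):
    """Return the installed version string of `package`, or None if absent."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None

version = installed_version("fedot")
print("fedot is not installed" if version is None else f"fedot {version}")
```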

How to Use

FEDOT provides a high-level API that allows you to use its capabilities in a simple way. The API can be used for classification, regression, and time series forecasting problems.

To use the API, follow these steps:

  1. Import the Fedot class:
from fedot.api.main import Fedot
  2. Initialize the Fedot object and define the type of modeling problem. The object provides a fit/predict interface:
  • Fedot.fit() starts the optimization and returns the resulting composite pipeline;
  • Fedot.predict() predicts target values for the given input data using an already fitted pipeline;
  • Fedot.get_metrics() estimates the quality of predictions using the selected metrics.

NumPy arrays, Pandas DataFrames, and file paths can be used as sources of input data. In the example below, x_train, y_train and x_test are numpy.ndarray objects:

model = Fedot(problem='classification', timeout=5, preset='best_quality', n_jobs=-1)
model.fit(features=x_train, target=y_train)
prediction = model.predict(features=x_test)
metrics = model.get_metrics(target=y_test)
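The snippet above assumes that x_train, y_train, x_test and y_test already exist. One way to produce such arrays for a quick experiment is sketched below with synthetic data; the data itself, the labeling rule, and the 75/25 split are assumptions made for illustration, not something FEDOT requires.

```python
import numpy as np

rng = np.random.default_rng(seed=0)

# Synthetic binary classification data: 200 samples, 4 features.
features = rng.normal(size=(200, 4))
target = (features[:, 0] + features[:, 1] > 0).astype(int)

# Shuffle the sample indices and hold out 25% of them for testing.
indices = rng.permutation(len(features))
split = int(len(features) * 0.75)
train_idx, test_idx = indices[:split], indices[split:]

x_train, y_train = features[train_idx], target[train_idx]
x_test, y_test = features[test_idx], target[test_idx]

print(x_train.shape, x_test.shape)  # (150, 4) (50, 4)
```

The resulting arrays can be passed directly as the features and target arguments shown above.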

More information about the API is available in the documentation section, and advanced approaches are covered in the advanced section.

Examples & Tutorials

Jupyter notebooks with tutorials are located in the examples repository.

Notebooks are issued with the corresponding release versions (the default version is 'latest').

Also, external examples are available:

Extended examples:

Several video tutorials are also available (in Russian).

Publications About FEDOT

We also published several posts devoted to different aspects of the framework:

In English:

In Russian:

  • How AutoML helps to create composite AI models: on structural learning and the FEDOT framework - habr.com
  • Time series forecasting with AutoML - habr.com
  • How we "turned the rivers back" at Emergency DataHack 2021 by combining hydrology and AutoML - habr.com
  • Clean AutoML for "dirty" data: how and why to automate the preprocessing of tables in machine learning - ODS blog
  • The FEDOT automated machine learning framework (Highload++ 2022 conference) - presentation
  • On hyperparameter tuning for ensembles of machine learning models - habr.com

In Chinese:

  • Generative automated machine learning system (presentation at the "Open Innovations 2.0" conference) - youtube.com

Project Structure

The latest stable release of FEDOT is in the master branch.

The repository includes the following directories:

  • The core package contains the main classes and scripts of the FEDOT framework
  • The examples package includes several how-to use cases that show how FEDOT works
  • All unit and integration tests are located in the test directory
  • The sources of the documentation are in the docs directory

Current R&D and future plans

We are currently working on new features and on improving the performance and user experience of FEDOT. The major ongoing tasks and plans are:

  • Effective and ready-to-use pipeline templates for certain tasks and data types;
  • Integration with GPU via Rapids framework;
  • Alternative optimization methods for fixed-shape pipelines;
  • Integration with MLFlow for import and export of the pipelines;
  • Improvement of the high-level API.

We are also carrying out several research tasks related to AutoML time-series benchmarking and multi-modal modeling.

Any contribution is welcome. Our R&D team is open to cooperation with other scientific teams as well as with industrial partners.

Documentation

A detailed FEDOT API description is available on Read the Docs.

Contribution Guide

  • The contribution guide is available in this repository.

Acknowledgments

We thank the contributors for their important contributions, and the participants of numerous scientific conferences and workshops for their valuable advice and suggestions.

Side Projects

  • The prototype of the web-GUI for FEDOT is available in the FEDOT.WEB repository.

Contacts

Supported by

Citation

@article{nikitin2021automated,
  title = {Automated evolutionary approach for the design of composite machine learning pipelines},
  author = {Nikolay O. Nikitin and Pavel Vychuzhanin and Mikhail Sarafanov and Iana S. Polonskaia and Ilia Revin and Irina V. Barabanova and Gleb Maximov and Anna V. Kalyuzhnaya and Alexander Boukhanovsky},
  journal = {Future Generation Computer Systems},
  year = {2021},
  issn = {0167-739X},
  doi = {https://doi.org/10.1016/j.future.2021.08.022}
}

@inproceedings{polonskaia2021multi,
  title = {Multi-Objective Evolutionary Design of Composite Data-Driven Models},
  author = {Polonskaia, Iana S. and Nikitin, Nikolay O. and Revin, Ilia and Vychuzhanin, Pavel and Kalyuzhnaya, Anna V.},
  booktitle = {2021 IEEE Congress on Evolutionary Computation (CEC)},
  year = {2021},
  pages = {926-933},
  doi = {10.1109/CEC45853.2021.9504773}
}

Other papers are available on ResearchGate.
