FEDOT is an open-source framework for automated modeling and machine learning (AutoML) problems. This framework is distributed under the 3-Clause BSD license.
It provides automatic generative design of machine learning pipelines for various real-world problems. The core of FEDOT is based on an evolutionary approach and supports classification (binary and multiclass), regression, clustering, and time series prediction problems.
The key feature of the framework is its management of complex interactions between the various blocks of a pipeline. Each pipeline is represented as a graph that defines the connections between data preprocessing and model blocks.
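To make the graph idea concrete, here is a minimal sketch in plain Python. It does not use FEDOT's own classes: the `Node` dataclass, the `topological_order` helper, and the operation names are hypothetical and chosen only for illustration. It shows two model blocks sharing one preprocessing block and feeding a final model.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Node:
    """One block of a pipeline: a preprocessing step or a model (illustrative)."""
    operation: str                                   # hypothetical operation name
    parents: List["Node"] = field(default_factory=list)


def topological_order(root: Node) -> List[str]:
    """Return operation names so that every block appears after its inputs."""
    order: List[str] = []
    seen = set()

    def visit(node: Node) -> None:
        if id(node) in seen:
            return
        seen.add(id(node))
        for parent in node.parents:
            visit(parent)
        order.append(node.operation)

    visit(root)
    return order


# Two branches share one preprocessing block and feed a final model.
scaling = Node("scaling")
branch_a = Node("logit", parents=[scaling])
branch_b = Node("knn", parents=[scaling])
final = Node("rf", parents=[branch_a, branch_b])

print(topological_order(final))  # ['scaling', 'logit', 'knn', 'rf']
```

Storing parents on each node keeps the structure a directed acyclic graph rooted at the final model, so a block can be evaluated once all of its inputs are ready.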
The project is maintained by the research team of the Natural Systems Simulation Lab, which is a part of the National Center for Cognitive Research of ITMO University.
More details about FEDOT are available in the next video:
FEDOT provides a high-level API that allows you to use its capabilities in a simple way. The API can be used for classification, regression, and time series forecasting problems.
To use the API, follow these steps:
- Import the Fedot class:
from fedot.api.main import Fedot
- Initialize the Fedot object and define the type of modeling problem. It provides a fit/predict interface:
  - Fedot.fit() begins the optimization and returns the resulting composite pipeline;
  - Fedot.predict() predicts target values for the given input data using an already fitted pipeline;
  - Fedot.get_metrics() estimates the quality of predictions using selected metrics.
NumPy arrays, pandas DataFrames, and file paths can be used as sources of input data. In the example below, x_train, y_train, and x_test are numpy.ndarray objects:
model = Fedot(problem='classification')
model.fit(features=x_train, target=y_train)
prediction = model.predict(features=x_test)
metrics = model.get_metrics()
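The snippet above assumes that x_train, y_train, and x_test already exist. The following sketch shows one way to wire the whole flow together on synthetic data. The helpers `make_toy_data`, `split`, and `run_fedot` are hypothetical names introduced here, and passing held-out labels through the `target` argument of `get_metrics` is an assumption about the API rather than something stated above. FEDOT and NumPy are imported lazily, so the data helpers run without them.

```python
import random


def make_toy_data(n_samples=200, n_features=4, seed=0):
    """Generate a toy binary classification set as plain Python lists."""
    rng = random.Random(seed)
    features = [[rng.gauss(0.0, 1.0) for _ in range(n_features)]
                for _ in range(n_samples)]
    # The label is 1 when the sum of the first two features is positive.
    target = [int(row[0] + row[1] > 0) for row in features]
    return features, target


def split(features, target, test_share=0.25):
    """Hold out the last `test_share` of the samples for testing."""
    n_test = int(len(features) * test_share)
    n_train = len(features) - n_test
    return (features[:n_train], target[:n_train],
            features[n_train:], target[n_train:])


def run_fedot(x_train, y_train, x_test, y_test):
    """Fit FEDOT on the training part and score it on the held-out part."""
    # Imported lazily so the helpers above work without FEDOT installed.
    import numpy as np
    from fedot.api.main import Fedot

    model = Fedot(problem='classification')
    model.fit(features=np.array(x_train), target=np.array(y_train))
    prediction = model.predict(features=np.array(x_test))
    # Assumption: get_metrics accepts the true labels via `target`.
    metrics = model.get_metrics(target=np.array(y_test))
    return prediction, metrics
```

Calling `run_fedot(*split(*make_toy_data()))` starts the optimization, which may take a while depending on the search budget.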
More information about the API is available in the documentation, and advanced approaches are covered in the Examples & Tutorials section.
Jupyter notebooks with tutorials are located in the examples repository. There you can find the following guides:
- Intro to AutoML
- Intro to FEDOT functionality
- Intro to time series forecasting with FEDOT
- Advanced time series forecasting
- Gap-filling in time series and out-of-sample forecasting
- Hybrid modelling with custom models
Notebooks are issued with the corresponding release versions (the default version is 'latest').
Also, external examples are available:
Extended examples:
- Credit scoring problem, i.e. binary classification task
- Time series forecasting, i.e. random process regression
- Spam detection, i.e. natural language preprocessing
- Movie rating prediction with multi-modal data
Also, several video tutorials are available (in Russian).
We have also published several posts and news items devoted to different aspects of the framework:
In English:
- How AutoML helps to create composite AI? - towardsdatascience.com
- AutoML for time series: definitely a good idea - towardsdatascience.com
- AutoML for time series: advanced approaches with FEDOT framework - towardsdatascience.com
- Winning a flood-forecasting hackathon with hydrology and AutoML - towardsdatascience.com
- Clean AutoML for “Dirty” Data - towardsdatascience.com
- FEDOT as a factory of human-competitive results - youtube.com
- Hyperparameters Tuning for Machine Learning Model Ensembles - towardsdatascience.com
In Russian:
- Как AutoML помогает создавать модели композитного ИИ — говорим о структурном обучении и фреймворке FEDOT - habr.com
- Прогнозирование временных рядов с помощью AutoML - habr.com
- Как мы “повернули реки вспять” на Emergency DataHack 2021, объединив гидрологию и AutoML - habr.com
- Чистый AutoML для “грязных” данных: как и зачем автоматизировать предобработку таблиц в машинном обучении - ODS blog
- Фреймворк автоматического машинного обучения FEDOT (Конференция Highload++ 2022) - presentation
- Про настройку гиперпараметров ансамблей моделей машинного обучения - habr.com
In Chinese:
- 生成式自动机器学习系统 (presentation at the "Open Innovations 2.0" conference) - youtube.com
The latest stable release of FEDOT is on the master branch.
The repository includes the following directories:
- Package core contains the main classes and scripts. It is the core of the FEDOT framework
- Package examples includes several how-to-use cases that are a good starting point for discovering how FEDOT works
- All unit and integration tests can be found in the test directory
- The sources of the documentation are in the docs directory
You can also check the benchmarking repository, which was developed to compare FEDOT against some well-known AutoML frameworks.
Currently, we are working on new features and trying to improve the performance and the user experience of FEDOT. The major ongoing tasks and plans:
- Effective and ready-to-use pipeline templates for certain tasks and data types;
- Integration with GPU via the RAPIDS framework;
- Alternative optimization methods for fixed-shape pipelines;
- Integration with MLflow for import and export of the pipelines;
- Improvement of the high-level API.
We are also working on several research tasks related to AutoML time series benchmarking and multi-modal modeling.
Any contribution is welcome. Our R&D team is open for cooperation with other scientific teams as well as with industrial partners.
The general description is available in the FEDOT.Docs repository.
A detailed FEDOT API description is also available on Read the Docs.
- The contribution guide is available in the repository.
We acknowledge the contributors for their important impact and the participants of the numerous scientific conferences and workshops for their valuable advice and suggestions.
- The prototype of web-GUI for FEDOT is available in FEDOT.WEB repository.
- Telegram channel for solving problems and answering questions on FEDOT
- Natural System Simulation Team
- Anna Kalyuzhnaya, Team leader ([email protected])
- Newsfeed
- Youtube channel
@article{nikitin2021automated,
  title = {Automated evolutionary approach for the design of composite machine learning pipelines},
  author = {Nikolay O. Nikitin and Pavel Vychuzhanin and Mikhail Sarafanov and Iana S. Polonskaia and Ilia Revin and Irina V. Barabanova and Gleb Maximov and Anna V. Kalyuzhnaya and Alexander Boukhanovsky},
  journal = {Future Generation Computer Systems},
  year = {2021},
  issn = {0167-739X},
  doi = {https://doi.org/10.1016/j.future.2021.08.022}
}

@inproceedings{polonskaia2021multi,
  title = {Multi-Objective Evolutionary Design of Composite Data-Driven Models},
  author = {Polonskaia, Iana S. and Nikitin, Nikolay O. and Revin, Ilia and Vychuzhanin, Pavel and Kalyuzhnaya, Anna V.},
  booktitle = {2021 IEEE Congress on Evolutionary Computation (CEC)},
  year = {2021},
  pages = {926-933},
  doi = {10.1109/CEC45853.2021.9504773}
}
Other papers are available on ResearchGate.