FLAML

A fast and lightweight AutoML library.
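License: MIT License
Language: Python
Category: Neural Networks / Artificial Intelligence, Machine Learning / Deep Learning
Software type: Open source
Region: Unknown
Submitted by: 越勇
Operating system: Cross-platform
Open-source organization:
Intended audience: Unknown
Software Overview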

FLAML - Fast and Lightweight AutoML


FLAML is a lightweight Python library that finds accurate machine learning models automatically, efficiently and economically. It frees users from selecting learners and hyperparameters for each learner. It is fast and economical. The simple and lightweight design makes it easy to extend, such as adding customized learners or metrics.

FLAML is powered by a new, cost-effective hyperparameter optimization and learner selection method invented by Microsoft Research. FLAML leverages the structure of the search space to choose a search order optimized for both cost and error. For example, the system tends to propose cheap configurations at the beginning stage of the search, but quickly moves to configurations with high model complexity and large sample size when needed in the later stage of the search. For another example, it favors cheap learners in the beginning but penalizes them later if the error improvement is slow. The cost-bounded search and cost-based prioritization make a big difference in the search efficiency under budget constraints.
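The cheap-first idea can be illustrated with a toy loop. This is only a sketch under simplifying assumptions, not FLAML's actual search method: the candidate list, the `cost` and `error` fields and the `cheap_first_search` helper are all hypothetical, and each candidate's cost is assumed to be known in advance.

```python
def cheap_first_search(candidates, evaluate, budget):
    """Try candidates from cheapest to most expensive until the budget runs out.

    candidates: list of dicts, each with a known "cost" (hypothetical field).
    evaluate:   callable returning the validation error of a candidate.
    budget:     total cost we are allowed to spend.
    """
    spent = 0.0
    best_candidate, best_error = None, float("inf")
    for candidate in sorted(candidates, key=lambda c: c["cost"]):
        if spent + candidate["cost"] > budget:
            break  # every remaining candidate is at least as expensive
        spent += candidate["cost"]
        error = evaluate(candidate)
        if error < best_error:
            best_candidate, best_error = candidate, error
    return best_candidate, best_error, spent


# Toy candidates: larger models cost more but reach a lower error.
candidates = [
    {"name": "large", "cost": 8.0, "error": 0.10},
    {"name": "small", "cost": 1.0, "error": 0.30},
    {"name": "medium", "cost": 3.0, "error": 0.18},
]
best, error, spent = cheap_first_search(
    candidates, lambda c: c["error"], budget=5.0
)
print(best["name"], error, spent)
```

With a budget of 5.0 the loop evaluates the small and medium candidates and never pays for the large one. A real cost-aware search also has to estimate costs on the fly and decide when a more expensive configuration is worth trying, which this toy does not attempt.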

FLAML also has a .NET implementation, available through ML.NET Model Builder. This ML.NET blog describes the improvement brought by FLAML.

Installation

FLAML requires Python version >= 3.6. It can be installed from pip:

pip install flaml

To run the notebook example, install flaml with the [notebook] option:

pip install flaml[notebook]

Quickstart

  • With three lines of code, you can start using this economical and fast AutoML engine as a scikit-learn style estimator.
from flaml import AutoML
automl = AutoML()
automl.fit(X_train, y_train, task="classification")
  • You can restrict the learners and use FLAML as a fast hyperparameter tuning tool for XGBoost, LightGBM, Random Forest etc. or a customized learner.
automl.fit(X_train, y_train, task="classification", estimator_list=["lgbm"])
  • You can also run generic ray-tune style hyperparameter tuning for a custom function.
from flaml import tune
tune.run(train_with_config, config={…}, low_cost_partial_config={…}, time_budget_s=3600)
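As a rough illustration of the last bullet, the sketch below fills in a toy `train_with_config` and a small search space. The hyperparameter names `n_rounds` and `lr`, the toy loss and the `run_search` helper are assumptions made for this example. It also assumes a FLAML version in which the evaluation function may return a dict of metrics; older versions may require reporting results differently.

```python
import math


def train_with_config(config):
    """Toy objective standing in for a real training run.

    The loss is smallest at n_rounds=64 and lr=0.1. A real function would
    train a model with these hyperparameters and return its validation loss.
    """
    loss = (math.log2(config["n_rounds"]) - 6) ** 2
    loss += (math.log10(config["lr"]) + 1) ** 2
    return {"loss": loss}


def run_search(time_budget_s=5):
    """Tune the toy objective with flaml.tune (requires `pip install flaml`)."""
    from flaml import tune  # imported lazily so the objective works without flaml

    analysis = tune.run(
        train_with_config,
        config={
            "n_rounds": tune.lograndint(lower=4, upper=1024),
            "lr": tune.loguniform(lower=1e-4, upper=1.0),
        },
        # Start from the cheapest value of the cost-related hyperparameter.
        low_cost_partial_config={"n_rounds": 4},
        metric="loss",
        mode="min",
        num_samples=-1,  # no cap on the number of trials; stop on the time budget
        time_budget_s=time_budget_s,
    )
    return analysis.best_config
```

Calling `run_search()` returns the best configuration found within the time budget. `low_cost_partial_config` only needs to cover the hyperparameters that drive training cost, here the hypothetical `n_rounds`.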

Advantages

  • For common machine learning tasks like classification and regression, find quality models with small computational resources.
  • Users can choose their desired customizability: minimal customization (computational resource budget), medium customization (e.g., scikit-style learner, search space and metric), full customization (arbitrary training and evaluation code).
  • Allow human guidance in hyperparameter tuning to respect prior on certain subspaces but also able to explore other subspaces. Read more about the hyperparameter optimization methods in FLAML here. They can be used beyond the AutoML context. And they can be used in distributed HPO frameworks such as ray tune or nni.
  • Support online AutoML: automatic hyperparameter tuning for online learning algorithms. Read more about the online AutoML method in FLAML here.
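For the medium customization level, a custom metric can be passed as a callable. The sketch below assumes the positional argument order used by recent FLAML releases (validation data, fitted estimator, labels, training data, optional weights) and a return value of the form (value to minimize, dict of values to log). The exact signature may differ between versions, and the `error_rate` helper and the gap penalty are hypothetical choices made for illustration.

```python
def error_rate(estimator, X, y):
    """Fraction of samples the fitted estimator misclassifies."""
    predictions = estimator.predict(X)
    return sum(p != t for p, t in zip(predictions, y)) / len(y)


def custom_metric(X_val, y_val, estimator, labels, X_train, y_train,
                  weight_val=None, weight_train=None, *args):
    """Validation error plus a small penalty for the train/validation gap."""
    val_err = error_rate(estimator, X_val, y_val)
    train_err = error_rate(estimator, X_train, y_train)
    gap_penalty = 0.5 * max(val_err - train_err, 0.0)
    # The first value is the one to minimize; the dict is only logged.
    return val_err + gap_penalty, {"val_err": val_err, "train_err": train_err}
```

It would then be used as `automl.fit(X_train, y_train, task="classification", metric=custom_metric)`.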

Examples

  • A basic classification example.
from flaml import AutoML
from sklearn.datasets import load_iris
# Initialize an AutoML instance
automl = AutoML()
# Specify automl goal and constraint
automl_settings = {
    "time_budget": 10,  # in seconds
    "metric": 'accuracy',
    "task": 'classification',
    "log_file_name": "test/iris.log",
}
X_train, y_train = load_iris(return_X_y=True)
# Train with labeled input data
automl.fit(X_train=X_train, y_train=y_train,
           **automl_settings)
# Predict
print(automl.predict_proba(X_train))
# Export the best model
print(automl.model)
  • A basic regression example.
from flaml import AutoML
from sklearn.datasets import fetch_california_housing
# Initialize an AutoML instance
automl = AutoML()
# Specify automl goal and constraint
automl_settings = {
    "time_budget": 10,  # in seconds
    "metric": 'r2',
    "task": 'regression',
    "log_file_name": "test/california.log",
}
X_train, y_train = fetch_california_housing(return_X_y=True)
# Train with labeled input data
automl.fit(X_train=X_train, y_train=y_train,
           **automl_settings)
# Predict
print(automl.predict(X_train))
# Export the best model
print(automl.model)
  • Time series forecasting.
# pip install flaml[forecast]
import numpy as np
from flaml import AutoML
X_train = np.arange('2014-01', '2021-01', dtype='datetime64[M]')
y_train = np.random.random(size=72)
automl = AutoML()
automl.fit(X_train=X_train[:72],  # a single column of timestamp
           y_train=y_train,  # value for each timestamp
           period=12,  # time horizon to forecast, e.g., 12 months
           task='forecast', time_budget=15,  # time budget in seconds
           log_file_name="test/forecast.log",
          )
print(automl.predict(X_train[72:]))
  • Learning to rank.
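import numpy as np
from sklearn.datasets import fetch_openml
from flaml import AutoML
X_train, y_train = fetch_openml(name="credit-g", return_X_y=True, as_frame=False)
# as_frame=False returns NumPy arrays, so encode the string labels as integer codes
y_train = np.unique(y_train, return_inverse=True)[1]
# not a real learning-to-rank dataset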
groups = [200] * 4 + [100] * 2    # group counts
automl = AutoML()
automl.fit(
    X_train, y_train, groups=groups,
    task='rank', time_budget=10,    # in seconds
)
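After any of the fits above, the outcome of the search can be inspected through attributes of the AutoML object. The helper below is a small sketch. `summarize_search` is a hypothetical name, and the attributes it reads (`best_estimator`, `best_config`, `best_loss`, `best_config_train_time`) are assumed to be available as in FLAML's notebook examples.

```python
def summarize_search(automl):
    """Collect the headline results of a finished AutoML run into a dict."""
    return {
        "best_learner": automl.best_estimator,  # name of the winning learner, e.g. "lgbm"
        "best_config": automl.best_config,      # hyperparameters of the best trial
        "best_loss": automl.best_loss,          # lowest validation loss found
        "train_time_s": automl.best_config_train_time,  # training time of that trial
    }
```

For example, `print(summarize_search(automl))` after `automl.fit(...)` shows which learner won and with which hyperparameters. For the accuracy metric, `1 - best_loss` gives the best validation accuracy.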

More examples can be found in notebooks.

Documentation

Please find the API documentation here.

Please find demo and tutorials of FLAML here.

For more technical details, please check our papers.

@inproceedings{wang2021flaml,
    title={FLAML: A Fast and Lightweight AutoML Library},
    author={Chi Wang and Qingyun Wu and Markus Weimer and Erkang Zhu},
    year={2021},
    booktitle={MLSys},
}

Contributing

This project welcomes contributions and suggestions. Most contributions require you to agree to a Contributor License Agreement (CLA) declaring that you have the right to, and actually do, grant us the rights to use your contribution. For details, visit https://cla.opensource.microsoft.com.

If you are new to GitHub, here is a detailed help source on getting involved with development on GitHub.

When you submit a pull request, a CLA bot will automatically determine whether you need to provide a CLA and decorate the PR appropriately (e.g., status check, comment). Simply follow the instructions provided by the bot. You will only need to do this once across all repos using our CLA.

This project has adopted the Microsoft Open Source Code of Conduct. For more information see the Code of Conduct FAQ or contact opencode@microsoft.com with any additional questions or comments.

Developing

Setup

git clone https://github.com/microsoft/FLAML.git
cd FLAML
pip install -e .[test,notebook]

Docker

We provide a simple Dockerfile.

docker build https://github.com/microsoft/FLAML.git#main -t flaml-dev
docker run -it flaml-dev

Develop in Remote Container

If you use vscode, you can open the FLAML folder in a Container. We have provided the configuration in .devcontainer.

Pre-commit

Run pre-commit install to install pre-commit into your git hooks. Before you commit, run pre-commit run to check if you meet the pre-commit requirements. If you use Windows (without WSL) and can't commit after installing pre-commit, you can run pre-commit uninstall to uninstall the hook. In WSL or Linux this is supposed to work.

Coverage

Any code you commit should not decrease coverage. To run all unit tests:

coverage run -m pytest test

Then you can see the coverage report by running coverage report -m or coverage html. If all the tests pass, please also test-run notebook/flaml_automl to make sure your commit does not break the notebook example.

Authors

  • Chi Wang
  • Qingyun Wu

Contributors (alphabetical order): Amir Aghaei, Vijay Aski, Sebastien Bubeck, Surajit Chaudhuri, Nadiia Chepurko, Ofer Dekel, Alex Deng, Anshuman Dutt, Nicolo Fusi, Jianfeng Gao, Johannes Gehrke, Niklas Gustafsson, Silu Huang, Dongwoo Kim, Christian Konig, John Langford, Menghao Li, Mingqin Li, Zhe Liu, Naveen Gaur, Paul Mineiro, Vivek Narasayya, Jake Radzikowski, Marco Rossi, Amin Saied, Neil Tenenholtz, Olga Vrousgou, Markus Weimer, Yue Wang, Qingyun Wu, Qiufeng Yin, Haozhe Zhang, Minjia Zhang, XiaoYun Zhang, Eric Zhu, and open-source contributors.

License

MIT License
