Hyperparameter Optimization Tools Summary (2): Hyperopt

赏梓

2023-12-01

Homepage: https://github.com/hyperopt

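Features: Hyperopt is a Python library for serial and parallel optimization over search spaces, which may include real-valued, discrete, and conditional dimensions. It supports parallelization across multiple machines and uses MongoDB as the central database for storing the results of hyperparameter combinations.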

Usage and Configuration

Usage:

https://github.com/hyperopt/hyperopt#getting-started

# Define the objective function
def objective(args):
    case, val = args
    if case == 'case 1':
        return val
    else:
        return val ** 2

# Define the search space
from hyperopt import hp
space = hp.choice('a',
    [
        ('case 1', 1 + hp.lognormal('c1', 0, 1)),
        ('case 2', hp.uniform('c2', -10, 10))
    ])

# Find the best parameters
from hyperopt import fmin, tpe, space_eval
best = fmin(objective, space, algo=tpe.suggest, max_evals=100)

# Print the results
print(best)
# -> {'a': 1, 'c2': 0.01420615366247227}
print(space_eval(space, best))
# -> ('case 2', 0.01420615366247227)

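Supported OS: Linux

Related libraries: hyperopt-sklearn and hyperas, two libraries for model selection and optimization that are built on Hyperopt and target scikit-learn and Keras, respectively

Scope: machine learning

Parallel computation: via MongoDB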

Defining the Search Space

https://github.com/hyperopt/hyperopt/wiki/FMin#2-defining-a-search-space

from hyperopt import hp
space = hp.choice('a',
    [
        ('case 1', 1 + hp.lognormal('c1', 0, 1)),
        ('case 2', hp.uniform('c2', -10, 10))
    ])

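'a' - selects which case is used
'c1' - a positive-valued parameter used in 'case 1'
'c2' - a bounded real-valued parameter used in 'case 2'

Parameter expressions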

A search space example for scikit-learn

from hyperopt import hp
space = hp.choice('classifier_type', [
    {
        'type': 'naive_bayes',
    },
    {
        'type': 'svm',
        'C': hp.lognormal('svm_C', 0, 1),
        'kernel': hp.choice('svm_kernel', [
            {'ktype': 'linear'},
            {'ktype': 'RBF', 'width': hp.lognormal('svm_rbf_width', 0, 1)},
            ]),
    },
    {
        'type': 'dtree',
        'criterion': hp.choice('dtree_criterion', ['gini', 'entropy']),
        'max_depth': hp.choice('dtree_max_depth',
            [None, hp.qlognormal('dtree_max_depth_int', 3, 1, 1)]),
        'min_samples_split': hp.qlognormal('dtree_min_samples_split', 2, 1, 1),
    },
    ])

Adding Non-Stochastic Expressions with pyll

Optimization Algorithms

Random search
Tree of Parzen Estimators（TPE）
Adaptive TPE

Example Output

# estim is a fitted HyperoptEstimator from hyperopt-sklearn
print(estim.best_model())  # print the best model found
# {'learner': ExtraTreesClassifier(bootstrap=True, class_weight=None, criterion='entropy',
#           max_depth=None, max_features=0.959202875857,
#           max_leaf_nodes=None, min_impurity_decrease=0.0,
#           min_impurity_split=None, min_samples_leaf=1,
#           min_samples_split=2, min_weight_fraction_leaf=0.0,
#           n_estimators=20, n_jobs=1, oob_score=False, random_state=3,
#           verbose=False, warm_start=False), 'preprocs': (), 'ex_preprocs': ()}
