benderopt

benderopt is a black-box optimization library.

For asynchronous use, see bender.dreem.com.
The implemented algorithm, "parzen_estimator", is similar to the TPE described in: Bergstra, James S., et al. "Algorithms for hyper-parameter optimization." Advances in Neural Information Processing Systems.
Demo
Here is a comparison over 200 evaluations of a function we want to minimize. Evaluation points are first chosen with a random estimator, then with benderopt's parzen_estimator.

The function to minimize is: cos(x) + cos(2 * x + 1) + cos(y).

The red point corresponds to the location of the global minimum for x and y between 0 and 2 * pi.

The code generating the video can be found in benchmark/benchmark_sinus2D.

In this example we can observe that the parzen estimator explores local minima more than the random approach does. For a given budget of evaluations, this can lead to a better optimization.
Goal
In black-box optimization, we have a function to optimize, but we cannot compute its gradient, and each evaluation is costly in time and/or resources. We therefore want to find a good exploration/exploitation trade-off in order to obtain the best hyperparameters in as few evaluations as possible. Use cases include:
- optimization of machine learning models (number of layers of a neural network, activation functions, etc.)
- business optimization (marketing, A/B testing)
- large-scale clinical studies
Minimal code example
One of benderopt's advantages is its JSON-like representation of the parameters to optimize, which makes them easy for a user to define. It also allows easy integration with an asynchronous system such as bender.dreem.com.

Here is a minimal example.
```python
from benderopt import minimize
import numpy as np

# We want to minimize the sinus function between 0 and 2pi
def f(x):
    return np.sin(x)

# We define the parameters we want to optimize:
optimization_problem_parameters = [
    {
        "name": "x",
        "category": "uniform",
        "search_space": {
            "low": 0,
            "high": 2 * np.pi,
        }
    }
]

# We launch the optimization
best_sample = minimize(f, optimization_problem_parameters, number_of_evaluation=50)

print(best_sample["x"], 3 * np.pi / 2)
```

> 4.710390692396651 4.71238898038469
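For comparison, here is what a naive random-search baseline for the same one-dimensional problem looks like. This sketch uses only numpy, not benderopt, and simply keeps the best of 50 uniform draws; with the same evaluation budget, benderopt's suggested points will typically concentrate closer to the minimizer 3 * pi / 2.

```python
import numpy as np

rng = np.random.default_rng(0)

def f(x):
    return np.sin(x)

# Draw 50 uniform samples in [0, 2*pi) and keep the one with the lowest loss.
samples = rng.uniform(0, 2 * np.pi, size=50)
best_x = samples[np.argmin(f(samples))]

# The true minimizer of sin(x) on [0, 2*pi) is 3*pi/2.
print(best_x, 3 * np.pi / 2)
```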
Minimal documentation
Optimization problem

An optimization problem contains:
- a list of parameters (i.e. the parameters and their search spaces)
- a list of observations (i.e. a value for each parameter in the list and the corresponding loss)

We use a JSON representation for each of them, e.g.
```python
optimization_problem_data = {
    "parameters": [
        {
            "name": "parameter_1",
            "category": "uniform",
            "search_space": {"low": 0, "high": 2 * np.pi, "step": 0.1}
        },
        {
            "name": "parameter_2",
            "category": "categorical",
            "search_space": {"values": ["a", "b", "c"]}
        }
    ],
    "observations": [
        {
            "sample": {"parameter_1": 0.4, "parameter_2": "a"},
            "loss": 0.1
        },
        {
            "sample": {"parameter_1": 3.4, "parameter_2": "a"},
            "loss": 0.1
        },
        {
            "sample": {"parameter_1": 4.1, "parameter_2": "c"},
            "loss": 0.1
        },
    ]
}
```
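Because the problem is just nested dictionaries and lists, it is easy to sanity-check before handing it to an optimizer. The following is a plain-Python consistency check (not part of benderopt) verifying that every observation provides a value for every declared parameter:

```python
import numpy as np

optimization_problem_data = {
    "parameters": [
        {"name": "parameter_1", "category": "uniform",
         "search_space": {"low": 0, "high": 2 * np.pi, "step": 0.1}},
        {"name": "parameter_2", "category": "categorical",
         "search_space": {"values": ["a", "b", "c"]}},
    ],
    "observations": [
        {"sample": {"parameter_1": 0.4, "parameter_2": "a"}, "loss": 0.1},
        {"sample": {"parameter_1": 3.4, "parameter_2": "a"}, "loss": 0.1},
    ],
}

# Every observation's sample must cover exactly the declared parameter names.
names = {p["name"] for p in optimization_problem_data["parameters"]}
for obs in optimization_problem_data["observations"]:
    assert set(obs["sample"]) == names
```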
Optimizer

An optimizer ingests an optimization problem and suggests a new sample. In other words, an optimizer takes a list of parameters with their search spaces, together with the history of past evaluations, and suggests a new sample to evaluate.

Using optimization_problem_data from the previous example:
```python
from benderopt.base import OptimizationProblem, Observation
from benderopt.optimizer import optimizers

optimization_problem = OptimizationProblem.from_json(optimization_problem_data)
optimizer = optimizers["parzen_estimator"](optimization_problem)
sample = optimizer.suggest()

print(sample)
```

> {"parameter_1": 3.9, "parameter_2": "b"}
The optimizers currently available are random and parzen_estimator.

benderopt makes it easy to add a new optimizer by inheriting from the BaseOptimizer class. You can check benderopt/optimizer/random.py for a minimal example.
Minimize function

The implementation of the minimize function shown in the minimal example section above is pretty straightforward:
```python
optimization_problem = OptimizationProblem.from_list(optimization_problem_parameters)
optimizer = optimizers["parzen_estimator"](optimization_problem)
for _ in range(number_of_evaluation):
    sample = optimizer.suggest()
    loss = f(**sample)
    observation = Observation.from_dict({"loss": loss, "sample": sample})
    optimization_problem.add_observation(observation)
```
At each iteration, the optimization problem's history of observations is extended with a new observation, which lets the optimizer take all of them into account for its next suggestion.
Uniform parameter
parameter | type | default | comments |
---|---|---|---|
low | mandatory | - | lowest possible value: all values will be greater than or equal to low |
high | mandatory | - | highest value: all values will be strictly less than high |
step | optional | None | discretizes the set of possible values: all values follow value = low + k * step with k belonging to [0, K] |
Example

```python
{
    "name": "x",
    "category": "uniform",
    "search_space": {
        "low": 0,
        "high": 2 * np.pi,
        # "step": np.pi / 8
    }
}
```
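The commented-out step illustrates the discretization rule from the table: every admissible value has the form value = low + k * step while staying strictly below high. A quick numpy sketch of the resulting grid (illustrative only, not benderopt's internal code):

```python
import numpy as np

low, high, step = 0.0, 2 * np.pi, np.pi / 8

# All admissible values low + k * step that stay strictly below high.
grid = low + step * np.arange(int(np.ceil((high - low) / step)))

print(len(grid), grid[:3])  # 16 grid points from 0 up to 15*pi/8
```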
Loguniform parameter
parameter | type | default | comments |
---|---|---|---|
low | mandatory | - | lowest possible value: all values will be greater than or equal to low |
high | mandatory | - | highest value: all values will be strictly less than high |
step | optional | None | discretizes the set of possible values: all values follow value = low + k * step with k belonging to [0, K] |
base | optional | 10 | logarithmic base to use |
Example

```python
{
    "name": "x",
    "category": "loguniform",
    "search_space": {
        "low": 1e-4,
        "high": 1e-2,
        # "step": 1e-5,
        # "base": 10,
    }
}
```
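A loguniform value is uniform in log space: the exponent is drawn uniformly and the base is raised to it, so each decade between low and high gets equal probability mass. A numpy sketch of that idea (an illustration of the concept, not benderopt's exact implementation):

```python
import numpy as np

rng = np.random.default_rng(0)
low, high = 1e-4, 1e-2  # base 10

# Uniform in log10 space: draw the exponent, then map back.
exponents = rng.uniform(np.log10(low), np.log10(high), size=1000)
values = 10.0 ** exponents

print(values.min(), values.max())  # everything stays within [1e-4, 1e-2]
```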
Normal parameter
parameter | type | default | comments |
---|---|---|---|
low | optional | -inf | lowest possible value: all values will be greater than or equal to low |
high | optional | inf | highest value: all values will be strictly less than high |
mu | mandatory | - | mean value: all values will initially be drawn from a Gaussian centered at mu with standard deviation sigma |
sigma | mandatory | - | standard deviation of the Gaussian the values are initially drawn from |
step | optional | None | discretizes the set of possible values: all values follow value = low + k * step with k belonging to [0, K] |
Example

```python
{
    "name": "x",
    "category": "normal",
    "search_space": {
        # "low": 0,
        # "high": 10,
        "mu": 5,
        "sigma": 1,
        # "step": 0.01,
    }
}
```
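When low and high bounds are given, samples drawn from the Gaussian must respect them; one common way to achieve this is to redraw any out-of-bounds sample. The sketch below shows that rejection-sampling approach (an assumption about the behavior, not benderopt's exact code):

```python
import numpy as np

rng = np.random.default_rng(0)
mu, sigma, low, high = 5.0, 1.0, 0.0, 10.0

def truncated_normal(n):
    # Redraw any sample that falls outside [low, high).
    out = rng.normal(mu, sigma, size=n)
    bad = (out < low) | (out >= high)
    while bad.any():
        out[bad] = rng.normal(mu, sigma, size=int(bad.sum()))
        bad = (out < low) | (out >= high)
    return out

values = truncated_normal(1000)
print(values.mean())  # close to mu = 5
```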
Lognormal parameter
parameter | type | default | comments |
---|---|---|---|
low | optional | -inf | lowest possible value: all values will be greater than or equal to low |
high | optional | inf | highest value: all values will be strictly less than high |
mu | mandatory | - | mean value: all values will initially be drawn from a Gaussian centered at mu with standard deviation sigma |
sigma | mandatory | - | standard deviation of the Gaussian the values are initially drawn from |
step | optional | None | discretizes the set of possible values: all values follow value = low + k * step with k belonging to [0, K] |
base | optional | 10 | logarithmic base to use |
Example

```python
{
    "name": "x",
    "category": "lognormal",
    "search_space": {
        # "low": 1e-6,
        # "high": 1e0,
        "mu": 1e-3,
        "sigma": 1e-2,
        # "step": 1e-7,
        # "base": 10,
    }
}
```
Categorical parameter
parameter | type | default | comments |
---|---|---|---|
values | mandatory | - | list of categories: all values will be sampled from this list |
probabilities | optional | number_of_values * [1 / number_of_values] | list of probabilities: all values will initially be drawn following this probability list |
Example

```python
{
    "name": "x",
    "category": "categorical",
    "search_space": {
        "values": ["a", "b", "c", "d"],
        # "probabilities": [0.1, 0.2, 0.3, 0.4]
    }
}
```
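Categorical sampling with an explicit probability list maps directly onto weighted random choice; a quick numpy sketch of the behavior described by the table (illustrative, not benderopt's internals):

```python
import numpy as np

rng = np.random.default_rng(0)
values = ["a", "b", "c", "d"]
probabilities = [0.1, 0.2, 0.3, 0.4]  # one entry per value, summing to 1

draws = rng.choice(values, size=1000, p=probabilities)

# "d" should be drawn roughly four times as often as "a".
counts = {v: int((draws == v).sum()) for v in values}
print(counts)
```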