<p>A while ago I built a class that wraps <strong>HyperOpt</strong> to fit my needs.</p>
<p>I'll try to minimize it so you can use it. Here is the code, with some comments to help you work things out:</p>
<pre><code>import numpy as np
from hyperopt import fmin, tpe, hp, STATUS_OK, Trials
import xgboost as xgb

max_float_digits = 4


def rounded(val):
    return '{:.{}f}'.format(val, max_float_digits)


class HyperOptTuner(object):
    """
    Tune my parameters!
    """
    def __init__(self, dtrain, dvalid, early_stopping=200, max_evals=200):
        self.counter = 0
        self.dtrain = dtrain
        self.dvalid = dvalid
        self.early_stopping = early_stopping
        self.max_evals = max_evals
        self.tuned_params = None

    def score(self, params):
        self.counter += 1
        # Edit params
        print("Iteration {}/{}".format(self.counter, self.max_evals))
        num_round = int(params['n_estimators'])
        del params['n_estimators']
        watchlist = [(self.dtrain, 'train'), (self.dvalid, 'eval')]
        model = xgb.train(params, self.dtrain, num_round, evals=watchlist,
                          early_stopping_rounds=self.early_stopping,
                          verbose_eval=False)
        n_epochs = model.best_ntree_limit
        score = model.best_score
        params['n_estimators'] = n_epochs
        params = dict([(key, rounded(params[key]))
                       if isinstance(params[key], float)
                       else (key, params[key])
                       for key in params])
        print("Trained with: ")
        print(params)
        print("\tScore {0}\n".format(score))
        return {'loss': 1 - score, 'status': STATUS_OK, 'params': params}

    def optimize(self, trials):
        space = {
            'n_estimators': 2000,  # hp.quniform('n_estimators', 10, 1000, 10),
            'eta': hp.quniform('eta', 0.025, 0.3, 0.025),
            'max_depth': hp.choice('max_depth', np.arange(1, 9, dtype=int)),
            'min_child_weight': hp.choice('min_child_weight', np.arange(1, 10, dtype=int)),
            'subsample': hp.quniform('subsample', 0.3, 1, 0.05),
            'gamma': hp.quniform('gamma', 0.1, 20, 0.1),
            'colsample_bytree': hp.quniform('colsample_bytree', 0.5, 1, 0.25),
            'eval_metric': 'map',
            'objective': 'rank:pairwise',
            'silent': 1
        }
        fmin(self.score, space, algo=tpe.suggest, trials=trials,
             max_evals=self.max_evals)
        min_loss = 1
        min_params = {}
        for trial in trials.trials:
            tmp_loss, tmp_params = trial['result']['loss'], trial['result']['params']
            if tmp_loss < min_loss:
                min_loss, min_params = tmp_loss, tmp_params
        print("Winning params:")
        print(min_params)
        print("\tScore: {}".format(1 - min_loss))
        self.tuned_params = min_params

    def tune(self):
        print("Tuning...\n")
        # Trials object where the history of search will be stored
        trials = Trials()
        self.optimize(trials)
</code></pre>
<p>So I used a class, mainly to define parameters and to save the results for future use. There are two main functions:</p>
<ol>
<li><p><strong>optimize()</strong> was created to define our "search space", compute the best parameters that minimize the error (so note that it <strong>minimizes</strong> the error), and save the best parameters it found. It also adds some prints to help you follow the process.</p></li>
<li><p><strong>score()</strong> is used to train a model with a specific set of hyperparameters from the "search space". It uses the early stopping defined <strong>in the class</strong>. Since I didn't need cross-validation I used xgb.train(), but you can change it to xgb.cv(), which does support early stopping rounds. It also adds prints to help you follow the process. score returns 1 - score (because the MAP I compute needs to be maximized; if you compute an error such as RMSE, just return the score as is).</p></li>
</ol>
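<p>The "return 1 - score" trick in point 2 can be seen in a tiny pure-Python sketch (no hyperopt needed; the function and names here are just an illustration, not part of the original code): a minimizer applied to the loss 1 - metric lands on the same point as maximizing the metric directly.</p>
<pre><code>def metric(x):
    """A score we want to MAXIMIZE (peaks at x = 3)."""
    return 1.0 - (x - 3.0) ** 2 / 100.0

candidates = [i * 0.5 for i in range(13)]  # 0.0, 0.5, ..., 6.0

# Minimize the loss 1 - metric, as score() does for MAP...
best_by_loss = min(candidates, key=lambda x: 1.0 - metric(x))
# ...which is the same as maximizing the metric itself.
best_by_score = max(candidates, key=metric)

print(best_by_loss, best_by_score)  # both pick x = 3.0
</code></pre>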
<p>This is how you activate it from your code, once you have your dtrain and dtest matrices:</p>
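<p>The original activation snippet was lost from the post; below is a minimal sketch of what it likely looked like, assuming <code>X_train</code>, <code>y_train</code>, <code>X_valid</code>, and <code>y_valid</code> already hold your feature/label arrays (those names are my assumption, not from the original):</p>
<pre><code>import xgboost as xgb

# Build the DMatrix objects the tuner expects (data names assumed).
dtrain = xgb.DMatrix(X_train, label=y_train)
dvalid = xgb.DMatrix(X_valid, label=y_valid)

# Run the tuning; the winning parameters end up in tuner.tuned_params.
tuner = HyperOptTuner(dtrain, dvalid, early_stopping=200, max_evals=200)
tuner.tune()

print(tuner.tuned_params)
</code></pre>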
<p>where <code>max_evals</code> is the size of the "search grid".</p>
<p>Follow these guidelines, and let me know if you run into trouble.</p>