Answers to: h2o GLM grid search lambda value


1 answer

Anonymous, 1 day ago

Final edit

There are several ways to retrieve lambda (shown below), but here are two concise ones. Fully reproducible code is at the bottom.

If you trained with <code>lambda_search = True</code>, look at the <code>lambda_search</code> column of the model summary table and check the value reported for <code>lambda.min</code>, which is the best lambda:

<pre><code>model.summary()['lambda_search']
</code></pre>

This produces a string similar to:

^{pr2}$

If you are not using lambda search, whether or not you set a lambda value yourself, you can also read it from the summary table:

<pre><code>model.summary()['regularization']
</code></pre>

The output looks like this:

<pre><code>['Elastic Net (alpha = 0.5, lambda = 0.01289 )']
</code></pre>

Another option is to look at the model's actual parameters, <code>best.actual_params['lambda']</code> and <code>best.actual_params['alpha']</code>, where <code>best</code> is the best model from your grid search results.

First edit

To get the best model from the grid, sort the grid by a metric and take the first model:

<pre><code>grid_table = grid.get_grid(sort_by='r2', decreasing=True)
best = grid_table.models[0]
</code></pre>

Then you can use:

<pre><code>best.actual_params['lambda']
</code></pre>

Fully reproducible example

<pre><code>import h2o
from h2o.estimators.glm import H2OGeneralizedLinearEstimator

h2o.init()

# import the airlines dataset:
# This dataset is used to classify whether a flight will be delayed ("YES") or not ("NO")
# original data can be found at http://www.transtats.bts.gov/
airlines = h2o.import_file("https://s3.amazonaws.com/h2o-public-test-data/smalldata/airlines/allyears2k_headers.zip")

# convert columns to factors
airlines["Year"] = airlines["Year"].asfactor()
airlines["Month"] = airlines["Month"].asfactor()
airlines["DayOfWeek"] = airlines["DayOfWeek"].asfactor()
airlines["Cancelled"] = airlines["Cancelled"].asfactor()
airlines["FlightNum"] = airlines["FlightNum"].asfactor()

# set the predictor names and the response column name
predictors = ["Origin", "Dest", "Year", "UniqueCarrier", "DayOfWeek", "Month", "Distance", "FlightNum"]
response = "IsDepDelayed"

# split into train and validation sets
train, valid = airlines.split_frame(ratios=[.8])

# try using the `lambda_` parameter:
# initialize your estimator
airlines_glm = H2OGeneralizedLinearEstimator(family='binomial', lambda_=.0001)

# then train your model
airlines_glm.train(x=predictors, y=response, training_frame=train, validation_frame=valid)

# print the AUC for the validation data
print(airlines_glm.auc(valid=True))

# Example of values to grid over for `lambda`
# import Grid Search
from h2o.grid.grid_search import H2OGridSearch

# select the values for lambda to grid over
hyper_params = {'lambda': [1, 0.5, 0.1, 0.01, 0.001, 0.0001, 0.00001, 0]}

# this example uses cartesian grid search because the search space is small
# and we want to see the performance of all models. For a larger search space use
# random grid search instead: {'strategy': "RandomDiscrete"}
# initialize the GLM estimator
airlines_glm_2 = H2OGeneralizedLinearEstimator(family='binomial')

# build grid search with previously made GLM and hyperparameters
grid = H2OGridSearch(model=airlines_glm_2, hyper_params=hyper_params,
                     search_criteria={'strategy': "Cartesian"})

# train using the grid
grid.train(x=predictors, y=response, training_frame=train, validation_frame=valid)

# sort the grid models by decreasing AUC
grid_table = grid.get_grid(sort_by='auc', decreasing=True)
print(grid_table)

# the first model in the sorted grid is the best one; read its lambda
best = grid_table.models[0]
print(best.actual_params['lambda'])
</code></pre>
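The <code>regularization</code> entry of the summary table is a string, so if you need alpha and lambda as numbers you have to parse it. The following is a minimal sketch in plain Python, assuming the string has the same shape as the sample output above. The helper name `parse_regularization` is hypothetical and not part of h2o, and the format may differ between h2o versions or when lambda search is enabled, so reading `actual_params` is likely the more robust route when it is available.

```python
import re

def parse_regularization(summary_text):
    """Pull alpha and lambda out of a string such as
    'Elastic Net (alpha = 0.5, lambda = 0.01289 )'.
    Hypothetical helper; the string format is an assumption based on the sample output."""
    # matches plain decimals and scientific notation, e.g. 0.5, 1, 1.0E-4
    number = r"([-+]?\d*\.?\d+(?:[eE][-+]?\d+)?)"
    found = {}
    for name in ("alpha", "lambda"):
        # \b keeps 'lambda' from matching inside a longer word such as 'nlambda'
        match = re.search(r"\b" + name + r"\s*=\s*" + number, summary_text)
        if match:
            found[name] = float(match.group(1))
    return found

example = "Elastic Net (alpha = 0.5, lambda = 0.01289 )"
params = parse_regularization(example)
print(params)
```

With a trained model you would pass in the first element of `model.summary()['regularization']` instead of the hard-coded `example` string. A key that is not found is simply left out of the returned dictionary, so check for its presence before using it.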
