I'm running GridSearchCV with a StratifiedShuffleSplit on a VGG16 model. I convert the data flow from a dataframe into NumPy arrays so I can pass them to GridSearchCV, but I run into memory problems because I'm training on 10,000 images. How can I solve this? Alternatively, is there another way to use K-fold cross-validation with 5 folds and perform hyperparameter tuning while training and fitting on each fold?
Here is my function; train_data is a dataframe and buildModel builds the VGG16 model:
import numpy as np
from sklearn.model_selection import StratifiedShuffleSplit, GridSearchCV
from tensorflow.keras.applications.vgg16 import preprocess_input
from tensorflow.keras.preprocessing.image import ImageDataGenerator
from tensorflow.keras.wrappers.scikit_learn import KerasClassifier

Y = train_data[['label']]
data = ImageDataGenerator(preprocessing_function=preprocess_input)
data_generator = data.flow_from_dataframe(train_data, directory=path,
                                          x_col="filename", y_col="label",
                                          class_mode="binary",
                                          target_size=(224, 224),
                                          batch_size=len(train_data))

model = KerasClassifier(build_fn=buildModel, verbose=0)

batch_size = [10, 20, 40, 60, 80, 100]
epochs = [10, 50, 100]
optimizer = ['SGD', 'RMSprop', 'Adagrad', 'Adadelta', 'Adam', 'Adamax', 'Nadam']
param_grid = dict(batch_size=batch_size, epochs=epochs, optimizer=optimizer)

# Pull the single full-size batch once; calling next() twice would advance
# the (shuffled) generator and misalign x and y.
x, y = data_generator.next()
print(x.shape)
print(y.shape)

stratifiedSplit = StratifiedShuffleSplit(n_splits=5, test_size=0.3)

grid = GridSearchCV(estimator=model,
                    n_jobs=-1,
                    verbose=1,
                    return_train_score=True,
                    cv=stratifiedSplit,
                    param_grid=param_grid)
grid_result = grid.fit(x, y)  # callbacks=[tbCallBack]

print("Best: %f using %s" % (grid_result.best_score_, grid_result.best_params_))
means = grid_result.cv_results_['mean_test_score']
stds = grid_result.cv_results_['std_test_score']
params = grid_result.cv_results_['params']
for mean, stdev, param in zip(means, stds, params):
    print("%f (%f) with: %r" % (mean, stdev, param))
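For context, one direction I was considering (a sketch under assumptions, not working code for my pipeline): split on the labels alone with StratifiedKFold, then only materialise one fold's images at a time instead of converting all 10,000 images into one giant array. The `labels` array below is a hypothetical stand-in for `train_data["label"]`; in the real pipeline each `train_data.iloc[train_idx]` slice would be fed to `flow_from_dataframe` separately.

```python
import numpy as np
from sklearn.model_selection import StratifiedKFold

# Hypothetical labels standing in for train_data["label"] (10,000 images,
# balanced binary classes).
labels = np.array([0, 1] * 5000)

skf = StratifiedKFold(n_splits=5, shuffle=True, random_state=42)
fold_sizes = []
for train_idx, val_idx in skf.split(np.zeros(len(labels)), labels):
    # In the real pipeline, pass train_data.iloc[train_idx] (and .iloc[val_idx])
    # to flow_from_dataframe here, so only one fold's images are loaded at once.
    fold_sizes.append((len(train_idx), len(val_idx)))

print(fold_sizes[0])  # each fold: 8000 train / 2000 validation
```

This only shows how the index splits would be produced; whether it can be combined cleanly with per-fold hyperparameter tuning is exactly what I'm asking about.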