I ran some experiments with a simple neural network using Keras and scikit-learn, and I got some unexpected results.
In my first experiment, the NN has a single hidden layer with 64 neurons, and I run a 5-split k-fold cross-validation using the StratifiedKFold class:
from tensorflow.keras import Sequential
from tensorflow.keras.layers import Dense
from sklearn.model_selection import StratifiedKFold
import numpy as np
import tensorflow as tf
import random
seed = 7
random.seed(seed)
np.random.seed(seed)
tf.random.set_seed(seed)
kfold = StratifiedKFold(n_splits=5, shuffle=True, random_state=seed)
for train, test in kfold.split(X, Y):
    model = Sequential()
    model.add(Dense(64, input_dim=12, activation='relu'))
    model.add(Dense(1))
    model.compile(optimizer='rmsprop', loss='mse', metrics=['mae'])
    model.summary()
    model.fit(X_train[train], Y[train], epochs=10, verbose=1)
    y_pred = model.predict(X_train[test])
    mse_value, mae_value = model.evaluate(X_train[test], Y[test], verbose=1)
    print(mse_value)
On the first fold, I get the following output:
Model: "sequential_169"
_________________________________________________________________
Layer (type) Output Shape Param #
=================================================================
dense_342 (Dense) (None, 64) 832
_________________________________________________________________
dense_343 (Dense) (None, 1) 65
=================================================================
Total params: 897
Trainable params: 897
Non-trainable params: 0
_________________________________________________________________
Epoch 1/10
163/163 [==============================] - 0s 520us/step - loss: 23.5748 - mae: 4.6046
Epoch 2/10
163/163 [==============================] - 0s 503us/step - loss: 1.9301 - mae: 1.0770
Epoch 3/10
163/163 [==============================] - 0s 502us/step - loss: 1.0503 - mae: 0.8026
Epoch 4/10
163/163 [==============================] - 0s 492us/step - loss: 0.7895 - mae: 0.6828
Epoch 5/10
163/163 [==============================] - 0s 503us/step - loss: 0.6499 - mae: 0.6171
Epoch 6/10
163/163 [==============================] - 0s 524us/step - loss: 0.5652 - mae: 0.5795
Epoch 7/10
163/163 [==============================] - 0s 506us/step - loss: 0.5806 - mae: 0.5819
Epoch 8/10
163/163 [==============================] - 0s 506us/step - loss: 0.4949 - mae: 0.5497
Epoch 9/10
163/163 [==============================] - 0s 493us/step - loss: 0.4864 - mae: 0.5418
Epoch 10/10
163/163 [==============================] - 0s 492us/step - loss: 0.4942 - mae: 0.5455
41/41 [==============================] - 0s 474us/step - loss: 0.4861 - mae: 0.5457
0.48606643080711365
...
Note that during training the loss goes from 23.5748 down to 0.4942.
In the second experiment, I use the GridSearchCV class to perform a grid search over the number of hidden layers. (To illustrate my problem, I only try one layer.) I also pass the same k-fold strategy as in the previous experiment to the GridSearchCV constructor:
from tensorflow.keras.wrappers.scikit_learn import KerasRegressor
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense
from sklearn.model_selection import GridSearchCV, StratifiedKFold
import numpy as np
import tensorflow as tf
import random
seed = 7
random.seed(seed)
np.random.seed(seed)
tf.random.set_seed(seed)
def create_model(hidden_layers=1):
    # Initialize the constructor
    model = Sequential()
    # Add hidden layers
    for i in range(hidden_layers):
        if i == 0:
            model.add(Dense(64, input_dim=12, activation='relu'))
        else:
            model.add(Dense(64, activation='relu'))
    # Add an output layer
    model.add(Dense(1))
    model.compile(optimizer='rmsprop', loss='mse', metrics=["mae"])
    model.summary()
    return model
model = KerasRegressor(build_fn=create_model, epochs=10, verbose=1)
param_grid = dict(hidden_layers=[1])
kfold = StratifiedKFold(n_splits=5, shuffle=True, random_state=seed)
grid = GridSearchCV(estimator=model, param_grid=param_grid,
                    scoring=["neg_mean_absolute_error", "neg_mean_squared_error", "r2"],
                    refit="r2",
                    n_jobs=1, cv=kfold)
grid_result = grid.fit(X, Y)
With this approach, on the first fold I get the following output:
Model: "sequential_180"
_________________________________________________________________
Layer (type) Output Shape Param #
=================================================================
dense_364 (Dense) (None, 64) 832
_________________________________________________________________
dense_365 (Dense) (None, 1) 65
=================================================================
Total params: 897
Trainable params: 897
Non-trainable params: 0
_________________________________________________________________
Epoch 1/10
163/163 [==============================] - 0s 527us/step - loss: 9.8205 - mae: 2.3366
Epoch 2/10
163/163 [==============================] - 0s 479us/step - loss: 1.0685 - mae: 0.8089
Epoch 3/10
163/163 [==============================] - 0s 503us/step - loss: 0.9351 - mae: 0.7488
Epoch 4/10
163/163 [==============================] - 0s 503us/step - loss: 0.9602 - mae: 0.7560
Epoch 5/10
163/163 [==============================] - 0s 502us/step - loss: 1.0195 - mae: 0.7830
Epoch 6/10
163/163 [==============================] - 0s 494us/step - loss: 0.9774 - mae: 0.7761
Epoch 7/10
163/163 [==============================] - 0s 489us/step - loss: 0.9569 - mae: 0.7413
Epoch 8/10
163/163 [==============================] - 0s 488us/step - loss: 0.9772 - mae: 0.7794
Epoch 9/10
163/163 [==============================] - 0s 464us/step - loss: 0.8716 - mae: 0.7259
Epoch 10/10
163/163 [==============================] - 0s 494us/step - loss: 0.8687 - mae: 0.7248
41/41 [==============================] - 0s 380us/step
...
Here the loss behaves completely differently from the first experiment: it goes from 9.8205 down to 0.8687.
Since I am setting:
seed = 7
random.seed(seed)
np.random.seed(seed)
tf.random.set_seed(seed)
kfold = StratifiedKFold(n_splits=5, shuffle=True, random_state=seed)
and both models have exactly the same architecture:
Model: "sequential_XXX"
_________________________________________________________________
Layer (type) Output Shape Param #
=================================================================
dense_XXX (Dense) (None, 64) 832
_________________________________________________________________
dense_XXY (Dense) (None, 1) 65
=================================================================
Total params: 897
Trainable params: 897
Non-trainable params: 0
I expected both networks to produce the same results (at least on the first fold), but I get different values for the loss function.
How can the NN in the first experiment behave differently from the NN in the second experiment?
The problem is that in the first experiment I trained on X_train, while in the second I trained on X. X_train is a scaled version of X.
That said, Marco's point about the seed still applies; see his answer.
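To avoid this kind of mismatch in the first place, the scaling can be done inside the CV loop, fitting the scaler on the training fold only. This is a minimal sketch with dummy data; the original post does not show how X_train was produced, so the use of StandardScaler here is an assumption:

```python
import numpy as np
from sklearn.preprocessing import StandardScaler
from sklearn.model_selection import KFold

# Dummy data standing in for the post's X and Y (illustrative only)
rng = np.random.RandomState(7)
X = rng.rand(100, 12)
Y = rng.rand(100)

kfold = KFold(n_splits=5, shuffle=True, random_state=7)
for train, test in kfold.split(X, Y):
    # Fit the scaler on the training fold only, then apply it to both
    # folds, so the model always sees consistently scaled data and no
    # information from the test fold leaks into the scaling.
    scaler = StandardScaler()
    X_train_scaled = scaler.fit_transform(X[train])
    X_test_scaled = scaler.transform(X[test])
```

With GridSearchCV, the same effect can be obtained by wrapping the scaler and the estimator in a sklearn Pipeline, so each fold is scaled independently.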
This is simply because Keras performs a random weight initialization every time a new model is built for each fold. Setting the seed once at the top makes the code below it reproducible as a whole, but the results still depend on the order of execution.
To make the results identical, you just have to re-initialize the same seed every time a new fold is fit. We do this at the top of the create_model function, and use it both with a manually written CV loop over KerasRegressor and with cross_val_score (from sklearn). Initialize some dummy data, then run both:
Manual CV results:
sklearn CV results:
Running notebook here.
This only holds on CPU.
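The per-fold seeding idea can be sketched as follows. The reset_seeds helper and the dummy data are illustrative, not from the original answer; the key point is that the seeds are reset inside the build function, once per fold:

```python
import random
import numpy as np
import tensorflow as tf
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense
from sklearn.model_selection import KFold

seed = 7

def reset_seeds():
    # Re-seed every RNG involved, so each call to create_model()
    # starts from identical initial weights.
    random.seed(seed)
    np.random.seed(seed)
    tf.random.set_seed(seed)

def create_model():
    reset_seeds()  # at the top, once per fold
    model = Sequential()
    model.add(Dense(64, input_dim=12, activation='relu'))
    model.add(Dense(1))
    model.compile(optimizer='rmsprop', loss='mse', metrics=['mae'])
    return model

# Dummy data (illustrative only)
rng = np.random.RandomState(seed)
X = rng.rand(100, 12)
Y = rng.rand(100)

# Manual CV: every fold now trains from the same initial weights
scores = []
kfold = KFold(n_splits=5, shuffle=True, random_state=seed)
for train, test in kfold.split(X, Y):
    model = create_model()
    model.fit(X[train], Y[train], epochs=2, verbose=0)
    mse, mae = model.evaluate(X[test], Y[test], verbose=0)
    scores.append(mse)
```

The same create_model can be passed to KerasRegressor plus cross_val_score; because the seeds are reset inside the build function, both routes then start each fold from the same weights (on CPU).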