NaN loss with Keras ResNet50

Posted 2024-10-03 17:24:20


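The validation loss is NaN, but the training loss looks fine.

How can I fix this?

I have already confirmed that there are no NaN values in the dataset.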
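For anyone who wants to repeat that check, here is a minimal sketch of how it might be done with NumPy. The helper name `report_non_finite` and the `demo` array are made up for illustration; in the question's setup the same call would be applied to `tri`, `vai`, `y_train` and `y_val`, assuming they are NumPy-compatible arrays. It also counts infinities, which can turn a loss into NaN just as easily.

```python
import numpy as np


def report_non_finite(name, array):
    """Count NaN and +/-inf entries in an array and print a one-line summary."""
    array = np.asarray(array, dtype=np.float64)
    n_nan = int(np.isnan(array).sum())
    n_inf = int(np.isinf(array).sum())
    print(f"{name}: shape={array.shape}, NaN={n_nan}, inf={n_inf}")
    return n_nan, n_inf


# Hypothetical stand-in for a small batch of images with one corrupted pixel.
demo = np.zeros((2, 4, 4, 3))
demo[0, 0, 0, 0] = np.nan
report_non_finite("demo", demo)
```

If every array comes back with zero NaN and zero inf, as the asker reports, the inputs themselves are probably not the cause.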

from tensorflow import keras

# Pretrained ResNet50 backbone without the ImageNet classification head
base_model = keras.applications.resnet50.ResNet50(include_top=False, weights='imagenet')

# Freeze the backbone so that only the new head is trained
for layer in base_model.layers:
    layer.trainable = False

# Binary classification head on top of the pooled features
avg = keras.layers.GlobalAveragePooling2D(name="global_avg")(base_model.output)
output = keras.layers.Dense(1, activation='sigmoid', name="predictions")(avg)
model = keras.Model(inputs=base_model.input, outputs=output, name="ResNet-50")

# `learning_rate` replaces the deprecated `lr` alias
optimizer = keras.optimizers.SGD(learning_rate=0.01, momentum=0.9, decay=0.0001, clipnorm=0.1)
reduce_LROP = keras.callbacks.ReduceLROnPlateau(monitor='val_loss', factor=0.1, patience=10, verbose=0, mode='auto',
    min_delta=0.0001, cooldown=0, min_lr=0)
# `tf` is never imported in this snippet, so use the imported `keras` namespace
model.compile(loss=keras.losses.BinaryCrossentropy(), optimizer=optimizer, metrics=['accuracy'])

# tri / vai are the training / validation images, y_train / y_val their labels
history = model.fit(tri, y_train, epochs=10, batch_size=32, validation_data=(vai, y_val),
                    callbacks=[reduce_LROP])

1 answer
User
#1 · Posted 2024-10-03 17:24:20

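I bought a GIGABYTE RTX 3080 Gaming OC 10GB for deep learning and used it to train a model.

I tested the same script in four environments:

  1. 3700X + RTX 3080 (CUDA 10.1)
  2. 3700X only (no GPU)
  3. Another laptop (i7 8750H + GTX 1050 Ti)
  4. 3700X + RTX 3080 (CUDA 11.0 + cuDNN 8.0.3)

The validation loss was fine in every environment except the first.

Using the TensorFlow nightly build with CUDA 11.0 solved my problem.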
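To see which of these environments a given machine corresponds to, something like the following sketch may help. It prints the TensorFlow version, the CUDA and cuDNN versions the installed build was compiled against, and the GPUs TensorFlow can see. The function name `describe_tf_environment` is made up for illustration; `tf.sysconfig.get_build_info()` is assumed to be available (it was added around TensorFlow 2.3), and the CUDA entries may be missing on CPU-only builds, hence the `.get` calls.

```python
def describe_tf_environment():
    """Return TensorFlow's version, its build-time CUDA/cuDNN versions, and visible GPUs."""
    try:
        import tensorflow as tf
    except ImportError:
        # TensorFlow is not installed in this interpreter.
        return {"tensorflow": None}

    # Older releases may lack get_build_info; fall back to an empty mapping.
    if hasattr(tf.sysconfig, "get_build_info"):
        build = tf.sysconfig.get_build_info()
    else:
        build = {}

    return {
        "tensorflow": tf.__version__,
        "cuda": build.get("cuda_version"),
        "cudnn": build.get("cudnn_version"),
        "gpus": [d.name for d in tf.config.list_physical_devices("GPU")],
    }


if __name__ == "__main__":
    print(describe_tf_environment())
```

If the reported CUDA version is 10.1 on an RTX 3080 machine, that matches the first environment in the list above, the only one where the validation loss went wrong.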
