KeyError: val_loss when training mod

Posted 2024-10-01 07:50:24


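I am training a model with Keras and get an error from the callbacks of the fit_generator function. Training always runs up to the 3rd epoch and then fails with this error.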

    annotation_path = 'train2.txt'
    log_dir = 'logs/000/'
    classes_path = 'model_data/deplao_classes.txt'
    anchors_path = 'model_data/yolo_anchors.txt'
    class_names = get_classes(classes_path)
    num_classes = len(class_names)
    anchors = get_anchors(anchors_path)

    input_shape = (416,416) # multiple of 32, hw

    is_tiny_version = len(anchors)==6 # default setting
    if is_tiny_version:
        model = create_tiny_model(input_shape, anchors, num_classes,
            freeze_body=2, weights_path='model_data/tiny_yolo_weights.h5')
    else:
        model = create_model(input_shape, anchors, num_classes,
            freeze_body=2, weights_path='model_data/yolo_weights.h5') # make sure you know what you freeze

    logging = TensorBoard(log_dir=log_dir)
    checkpoint = ModelCheckpoint(log_dir + 'ep{epoch:03d}-loss{loss:.3f}-val_loss{val_loss:.3f}.h5',
        monitor='val_loss', save_weights_only=True, save_best_only=True, period=3)

    reduce_lr = ReduceLROnPlateau(monitor='val_loss', factor=0.1, patience=3, verbose=1)
    early_stopping = EarlyStopping(monitor='val_loss', min_delta=0, patience=10, verbose=1)


[error]
Traceback (most recent call last):
  File "train.py", line 194, in <module>
    _main()
  File "train.py", line 69, in _main
    callbacks=[logging, checkpoint])
  File "C:\Users\ilove\AppData\Roaming\Python\Python37\lib\site-packages\keras\legacy\interfaces.py", line 91, in wrapper
    return func(*args, **kwargs)
  File "C:\Users\ilove\AppData\Roaming\Python\Python37\lib\site-packages\keras\engine\training.py", line 1418, in fit_generator
    initial_epoch=initial_epoch)
  File "C:\Users\ilove\AppData\Roaming\Python\Python37\lib\site-packages\keras\engine\training_generator.py", line 251, in fit_generator
    callbacks.on_epoch_end(epoch, epoch_logs)
  File "C:\Users\ilove\AppData\Roaming\Python\Python37\lib\site-packages\keras\callbacks.py", line 79, in on_epoch_end
    callback.on_epoch_end(epoch, logs)
  File "C:\Users\ilove\AppData\Roaming\Python\Python37\lib\site-packages\keras\callbacks.py", line 429, in on_epoch_end
    filepath = self.filepath.format(epoch=epoch + 1, **logs)
KeyError: 'val_loss'

Can anyone spot the problem and help me?

Thanks in advance for your help.


2 answers

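This answer does not apply to the question itself, but this page is the top Google result for keras "KeyError: 'val_loss'", so I will share the solution to my own problem.

The error was the same for me: when using val_loss in the checkpoint filename, I got KeyError: 'val_loss'. My checkpoint was also monitoring this field, so even after removing it from the filename I still got this warning from the checkpoint: WARNING:tensorflow:Can save best model only with val_loss available, skipping.

In my case, the problem was that I had upgraded from using standalone Keras with TensorFlow 1 to using the Keras that ships with TensorFlow 2. The period parameter of ModelCheckpoint has been replaced by save_freq. I wrongly assumed that save_freq behaved the same way, so I set save_freq=1, thinking this would save after every epoch. However, the docs state: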

save_freq: 'epoch' or integer. When using 'epoch', the callback saves the model after each epoch. When using integer, the callback saves the model at end of a batch at which this many samples have been seen since last saving. Note that if the saving isn't aligned to epochs, the monitored metric may potentially be less reliable (it could reflect as little as 1 batch, since the metrics get reset every epoch). Defaults to 'epoch'

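Setting save_freq='epoch' solved the problem for me. Note: the OP is still using the old period argument (period=3 in the question), so this is definitely not what is causing their problem.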

This callback runs at the end of epoch 3.

    checkpoint = ModelCheckpoint(log_dir + 'ep{epoch:03d}-loss{loss:.3f}-val_loss{val_loss:.3f}.h5',
        monitor='val_loss', save_weights_only=True, save_best_only=True, period=3)

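The error message says that the logs variable contains no val_loss when the following line is executed:

    filepath = self.filepath.format(epoch=epoch + 1, **logs)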

This happens if fit is called without validation data.
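The failure can be reproduced without Keras, because the failing line is a plain str.format call. A minimal sketch, assuming that logs holds only the training loss when no validation data is supplied (the numeric values are made up for illustration):

```python
# Filename template taken from the question's ModelCheckpoint.
template = 'ep{epoch:03d}-loss{loss:.3f}-val_loss{val_loss:.3f}.h5'

# Assumed contents of `logs` at epoch end when there is no validation data.
logs = {'loss': 1.234}

try:
    template.format(epoch=3, **logs)
    missing_key = None
except KeyError as exc:
    # str.format raises KeyError for a placeholder that has no matching key.
    missing_key = exc.args[0]

print(missing_key)  # val_loss

# Once a validation metric is present, the same call succeeds.
logs_with_val = {'loss': 1.234, 'val_loss': 1.5}
filename = template.format(epoch=3, **logs_with_val)
print(filename)  # ep003-loss1.234-val_loss1.500.h5
```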

I would start by simplifying the path name of the model checkpoint. Including the epoch in the name is enough.
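As a rough sketch of that simplification, the same str.format call from the traceback can be tried with a template that references only epoch. The template name and the logs contents here are assumptions for illustration:

```python
# Hypothetical simplified template: only the epoch number appears in the name.
simple_template = 'ep{epoch:03d}.h5'

# Assumed `logs` without any validation metric.
logs = {'loss': 1.234}

# str.format ignores keyword arguments the template does not use,
# so the missing val_loss no longer matters for the filename.
simple_name = simple_template.format(epoch=3, **logs)
print(simple_name)  # ep003.h5
```

This only removes the KeyError raised while building the filename. With monitor='val_loss' and save_best_only=True, the callback still needs validation data (for example through the validation_data argument of fit_generator) to have a value to compare.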
