Keras custom loss function that ignores false negatives of a specific class during semantic segmentation?


See the edits below; the original post now makes very little sense, but the question still stands.


I have developed a neural network for semantic segmentation of images. I have worked through various loss functions (categorical cross-entropy (CCE), weighted CCE, focal loss, Tversky loss, Jaccard loss, focal Tversky loss, etc.) that attempt to handle highly skewed class representation, but none has produced the desired effect. My advisor suggested creating a custom loss function that ignores false negatives for a specific class (but still penalizes false positives).

I have a 6-class problem, and my network is set up to use one-hot-encoded ground-truth data. My loss function therefore receives two tensors, y_true and y_pred, of shape (batch, row, col, class) (currently (8, 128, 128, 6)). To be able to reuse the losses I have already explored, I want to alter y_pred so that the predicted value for a specific class (class 0) is always correct. That is, where y_true == class 0, set y_pred == class 0; otherwise do nothing.

Since tensorflow tensors are immutable, I have spent far too long trying to create this loss function. My first attempt (carried over from my experience with numpy):

def weighted_categorical_crossentropy_ignore(weights):
    weights = K.variable(weights)

    def loss(y_true, y_pred):
        # NOTE: item assignment like this is not supported on tf tensors;
        # this line raises "TypeError: 'Tensor' object does not support item assignment"
        y_pred[tf.where(y_true == [1, 0, 0, 0, 0, 0])] = [1, 0, 0, 0, 0, 0]

        # Scale predictions so that the class probs of each sample sum to 1
        y_pred /= K.sum(y_pred, axis=-1, keepdims=True)
        # Clip to prevent NaN's and Inf's
        y_pred = K.clip(y_pred, K.epsilon(), 1 - K.epsilon())
        loss = y_true * K.log(y_pred) * weights
        loss = -K.sum(loss, -1)
        return loss
    return loss

Obviously I cannot mutate y_pred, so this attempt failed. I eventually created a few monstrosities that tried to "build" the tensor by iterating over [batch, row, col] and performing comparisons. While these attempts did not technically fail, they never actually started training; I figure computing the loss took on the order of minutes.


After several failed efforts, I started trying to express the necessary computation in pure numpy as an SSCCE (Short, Self Contained, Correct Example). Bear in mind that I am essentially limited to instantiating "simple" tensors (i.e. ones, zeros) and performing "simple" operations such as element-wise multiplication, addition, and reshaping. That brought me to this:

import numpy as np
from tensorflow.keras.utils import to_categorical

# Generate the "images" at random
true_flat = np.argmax(np.random.rand(1, 2, 2, 4), axis=3).astype('int')
true = to_categorical(true_flat, num_classes=4).astype('int')

pred_flat = np.argmax(np.random.rand(1, 2, 2, 4), axis=3).astype('int')
pred = to_categorical(pred_flat, num_classes=4).astype('int')

print('True:\n', true_flat)
print('Pred:\n', pred_flat)

# Create a mask representing an all "class 0" image
class_zero_label = np.array([1, 0, 0, 0])
czl_all = class_zero_label * np.ones(true.shape).astype('int')

# Mask both the truth and pred to locate class 0 pixels
czl_true_locs = czl_all * true
czl_pred_locs = czl_all * pred

# Subtract to create "addition" matrix
a  = (czl_true_locs - czl_pred_locs) * czl_true_locs
print('a:\n', a)

# Do this
m = ((a + 1) - (a * 2))
print('m - ', m.shape, ':\n', m)

# Pull the front entry from 'm' and "expand" its value
#x = (m[:, :, :, 0].flatten() * np.ones(pred.shape).astype('int')).T.reshape(pred.shape)
m_front = m[:, :, :, 0]
print('m_front - ', m_front.shape, ':\n', m_front)

#m_flat = m_front.flatten()
m_flat = m_front.reshape(m_front.shape[0], m_front.shape[1]*m_front.shape[2])
print('m_flat - ', m_flat.shape, ':\n', m_flat)

m_expand = m_flat * np.ones(pred.shape).astype('int')
print('m_expand - ', m_expand.shape, ':\n', m_expand)

m_trans = m_expand.T
m_fixT = m_trans.reshape(pred.shape)
print('m_fixT - ', m_fixT.shape, ':\n', m_fixT)

m = m_fixT
print('m:\n', m.shape)

# Perform the math as described
pred = (pred * m) + a
print('Pred:\n', np.argmax(pred, axis=3))

This SSCCE is, well, awful and convoluted. Essentially my goal here is to create two matrices: an "addition" matrix and a "multiplication" matrix. The multiplication matrix is meant to zero out every pixel of the prediction where the truth is class 0, irrespective of that pixel's value (i.e. its one-hot-encoded vector); zeroing out means setting the pixel to [0, 0, 0, 0, 0, 0]. The addition matrix is then meant to add the vector [1, 0, 0, 0, 0, 0] at each of those zeroed-out locations. In the end this achieves the goal of setting the prediction of every truly class-0 pixel to be correct.

The problem is that this SSCCE does not translate fully into tensorflow operations. The first issue is the generation of the multiplication matrix, which is not defined correctly for batch_size > 1. Just to see whether it would work at all, I figured I would tf.unstack the y_true and y_pred tensors and iterate over them. That brought me to the current incarnation of the loss function:

def weighted_categorical_crossentropy_ignore(weights):
    weights = K.variable(weights)

    def loss(y_true, y_pred):

        y_true_un = tf.unstack(y_true)
        y_pred_un = tf.unstack(y_pred)

        y_pred_new = []
        for i in range(0, y_true.shape[0]):
            yt = y_true_un[i]
            yp = y_pred_un[i]

            # Pred:
            # [[[0 3] * [[[1 0] + [[[0 1] = [[[0 0]
            #  [3 1]]]   [[1 1]]]  [[0 0]]]  [[3 1]]]
            # If we multiple pred by a tensor which zeros out only incorrect class 0 labelleling
            # Then add class zero to those zero'd out locations
            # We can negate the effect of mis-classified class 0 pixels but still punish for
            # incorrectly predicted class 0 labels for other classes.

            # Create a mask representing an all "class 0" image
            class_zero_label = K.variable([1.0, 0.0, 0.0, 0.0, 0.0, 0.0])
            czl_all = class_zero_label * K.ones(yt.shape)

            # Mask both true and pred to locate class 0 pixels
            czl_true = czl_all * yt
            czl_pred = czl_all * yp

            # Subtract to create "addition matrix"
            a = czl_true - czl_pred

            # Do this.
            m = ((a + 1) - (a * 2.))

            # And this.
            x = K.flatten(m[:, :, 0])
            x = x * K.ones(yp.shape)
            x = K.transpose(x)
            x = K.reshape(x, yp.shape)

            # Voila.
            ypnew = (yp * x) + a

            y_pred_new.append(ypnew)

        y_pred_new = tf.concat(y_pred_new, 0)


        # Continue calculating weighted categorical crossentropy
        # -------------------------------------------------------

        # Scale predictions so that the class probs of each sample sum to 1
        y_pred_new /= K.sum(y_pred_new, axis=-1, keepdims=True)
        # Clip to prevent NaN's and Inf's
        y_pred_new = K.clip(y_pred_new, K.epsilon(), 1 - K.epsilon())
        loss = y_true * K.log(y_pred_new) * weights
        loss = -K.sum(loss, -1)
        return loss
    return loss

The current problem with this loss function is the apparent difference in behavior between numpy and tensorflow when performing

x = K.flatten(m[:, :, 0])
x = x * K.ones(yp.shape)

which is meant to reproduce the behavior of

m_flat = m_front.flatten()
m_expand = m_flat * np.ones(pred.shape).astype('int')

from the SSCCE.


So at this point I feel I have descended so deep into caveman coding that I cannot dig my way out. I have to imagine there is some simple way, akin to my very first attempt, to perform the described behavior.

So I suppose my direct question is: how do I implement

y_pred[tf.where(y_true == [1, 0, 0, 0, 0, 0])] = [1, 0, 0, 0, 0, 0]

in a custom tensorflow loss function?
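(A minimal mutation-free sketch of that operation, assuming one-hot (batch, row, col, class) tensors named y_true / y_pred as in the loss above; the three-argument form of tf.where selects between two tensors element-wise without any assignment:)

import tensorflow as tf

# constant one-hot vector for class 0
class0 = tf.constant([1., 0., 0., 0., 0., 0.])

# (batch, row, col) boolean mask of truly class-0 pixels
is_class0 = tf.reduce_all(tf.math.equal(y_true, class0), axis=-1)

# broadcast the mask over the channel axis and select element-wise;
# no tensor is mutated, so gradients still flow through the untouched pixels
y_pred_new = tf.where(is_class0[..., None], class0, y_pred)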


EDIT: After flailing around quite a bit, I finally worked out how to call .numpy() on the y_true / y_pred tensors so that I could use plain numpy operations (apparently enabling eager execution at the start of the program "doesn't work"; I had to pass run_eagerly=True to compile()).
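For reference, a minimal sketch of that compile call (the optimizer is a placeholder):

# Assumption: run_eagerly=True forces the loss to execute eagerly, so
# y_true / y_pred arrive as EagerTensors and .numpy() is available on them.
model.compile(
    optimizer='adam',  # placeholder
    loss=weighted_categorical_crossentropy_ignore(weights),
    run_eagerly=True)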

This let me implement essentially my original first attempt:

def weighted_categorical_crossentropy_ignore(weights):
    weights = K.variable(weights)

    def loss(y_true, y_pred):
        yp = y_pred.numpy()
        yt = y_true.numpy()
        yp[np.nonzero(np.all(yt == [1, 0, 0, 0, 0, 0], axis=3))] = [1, 0, 0, 0, 0, 0]
 
        # Continue calculating weighted categorical crossentropy
        # -------------------------------------------------------
        # Scale predictions so that the class probs of each sample sum to 1
        yp /= K.sum(yp, axis=-1, keepdims=True)
        # Clip to prevent NaN's and Inf's
        yp = K.clip(yp, K.epsilon(), 1 - K.epsilon())
        loss = y_true * K.log(yp) * weights
        loss = -K.sum(loss, -1)
        return loss
    return loss

However, by calling y_pred.numpy() (or by using the result thereafter) I have apparently "broken" the path/flow through the network, based on the error raised when attempting .fit:

ValueError: No gradients provided for any variable: ['conv3d/kernel:0', <....>

I assume I need to "remarshal" the tensor back into GPU memory? I tried

yp = tf.convert_to_tensor(yp)

to no avail; same error. So I guess the same problem remains, just with a different cause.


EDIT2: From this SO Answer it seems I actually cannot use numpy() to marshal y_true / y_pred into plain numpy operations. Doing so necessarily "breaks" the network path, so gradients cannot be computed: TensorFlow can only differentiate through operations recorded on its own tensors, and values pulled out into numpy fall outside that record.

I then realized that with run_eagerly=True I could wrap my y_true / y_pred in tf.Variable and perform assignment. So I tried to recreate the same code again, in pure tensorflow:

def weighted_categorical_crossentropy_ignore(weights):
    weights = K.variable(weights)

    def loss(y_true, y_pred):
        # yp = y_pred.numpy().copy()
        # yt = y_true.numpy().copy()
        # yp[np.nonzero(np.all(yt == [1, 0, 0, 0, 0, 0], axis=3))] = [1, 0, 0, 0, 0, 0]

        yp = K.variable(y_pred)
        yt = K.variable(y_true)
        #np.all
        x = K.all(yt == [1, 0, 0, 0, 0, 0], axis=3)
        #np.nonzero
        ne = tf.not_equal(x, tf.constant(False))
        y = tf.where(ne)

        # Perform the desired operation
        yp[y] = [1, 0, 0, 0, 0, 0]

        # Continue calculating weighted categorical crossentropy
        # -------------------------------------------------------
        # Scale predictions so that the class probs of each sample sum to 1
        #yp /= K.sum(yp, axis=-1, keepdims=True) # Cannot use /= on a tf.Variable; must use var = var / ...
        yp = yp / K.sum(yp, axis=-1, keepdims=True)
        # Clip to prevent NaN's and Inf's
        yp = K.clip(yp, K.epsilon(), 1 - K.epsilon())
        loss = y_true * K.log(yp) * weights
        loss = -K.sum(loss, -1)
        return loss
    return loss

But alas, this apparently creates the same problem as calling .numpy(): no gradients can be computed. It seems I am back to square one.


EDIT3: Using the solution suggested by gobrewers14 in the answer posted below, but modified based on my knowledge of the problem, I produced this loss function:

def weighted_categorical_crossentropy_ignore(weights):
    weights = K.variable(weights)

    def loss(y_true, y_pred):
        print('y_true.shape: ', y_true.shape)
        print('y_pred.shape: ', y_pred.shape)

        # Generate modified y_pred where all truly class0 pixels are correct
        y_true_class0_indicies = tf.where(tf.math.equal(y_true, [1., 0., 0., 0., 0., 0.]))
        y_pred_updates = tf.repeat([
            [1.0, 0.0, 0.0, 0.0, 0.0, 0.0]],
            repeats=y_true_class0_indicies.shape[0],
            axis=0)
        yp = tf.tensor_scatter_nd_update(y_pred, y_true_class0_indicies, y_pred_updates)

        # Continue calculating weighted categorical crossentropy
        # -------------------------------------------------------
        # Scale predictions so that the class probs of each sample sum to 1
        yp /= K.sum(yp, axis=-1, keepdims=True)
        # Clip to prevent NaN's and Inf's
        yp = K.clip(yp, K.epsilon(), 1 - K.epsilon())
        loss = y_true * K.log(yp) * weights
        loss = -K.sum(loss, -1)
        return loss
    return loss

Since the original answer assumed y_true to be of shape [8, 128, 128] (i.e. a "flat" class representation, rather than the one-hot-encoded representation [8, 128, 128, 6]), I first printed the shapes of the y_true and y_pred input tensors for sanity:

y_true.shape:  (8, 128, 128, 6)
y_pred.shape:  (8, 128, 128, 6)

For further completeness, the network output shape, as given by the tail of model.summary, is:

conv2d_18 (Conv2D)              (None, 128, 128, 6)  1542        dropout_5[0][0]                  
__________________________________________________________________________________________________
activation_9 (Activation)       (None, 128, 128, 6)  0           conv2d_18[0][0]                  
==================================================================================================
Total params: 535,551,494
Trainable params: 535,529,478
Non-trainable params: 22,016
__________________________________________________________________________________________________

I then followed the "pattern" of the proposed solution and replaced the original tf.math.equal(y_true, 0) with tf.math.equal(y_true, [1., 0., 0., 0., 0., 0.]) to handle the one-hot-encoded case. From my current understanding of the proposed solution (after about 10 minutes of inspection) I thought this should work. However, the following exception is raised when attempting to train the model:

InvalidArgumentError: Inner dimensions of output shape must match inner dimensions of updates shape. Output: [8,128,128,6] updates: [684584,6] [Op:TensorScatterUpdate]

So it seems the production of (what I have named) y_pred_updates yields a "collapsed" tensor with "too many" elements. I understand the motivation for using tf.repeat, but its specific use here seems incorrect. From what I understand of tf.tensor_scatter_nd_update, I assumed it would produce a tensor of shape (8, 128, 128, 6). I assume this is most likely due to the choice of repeats and axis in the call to tf.repeat.
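A likely fix, sketched under the assumption that the problem is the comparison itself: tf.where on the raw element-wise equality returns one index per matching scalar (hence the 684584 rows in the error), not one per pixel. Reducing the comparison over the channel axis first yields one (batch, row, col) index per pixel, which matches what tf.tensor_scatter_nd_update expects when scattering full 6-vectors:

class0 = tf.constant([1., 0., 0., 0., 0., 0.])

# Reduce the element-wise comparison over the last axis: one bool per pixel.
is_class0 = tf.reduce_all(tf.math.equal(y_true, class0), axis=-1)

# Now each row of `indices` is a (batch, row, col) triple.
indices = tf.where(is_class0)

# One full 6-vector update per matched pixel; tf.shape handles the
# statically-unknown match count in graph mode.
updates = tf.repeat([[1., 0., 0., 0., 0., 0.]],
                    repeats=tf.shape(indices)[0], axis=0)

yp = tf.tensor_scatter_nd_update(y_pred, indices, updates)  # (8, 128, 128, 6)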


1 Answer (by gobrewers14)

If I understand your question correctly, you are looking for something like the following:

import tensorflow as tf


# batch of true labels
y_true = tf.constant([5, 0, 1, 3, 4, 0, 2, 0], dtype=tf.int64)

# batch of class probabilities
y_pred = tf.constant(
  [
    [0.34670502, 0.04551039, 0.14020428, 0.14341979, 0.21430719, 0.10985339],
    [0.25681055, 0.14013883, 0.19890164, 0.11124421, 0.14526634, 0.14763844],
    [0.09199252, 0.21889475, 0.1170236 , 0.1929019 , 0.20311192, 0.17607528],
    [0.3246354 , 0.23257554, 0.15549366, 0.17282239, 0.00000001, 0.11447308],
    [0.16502093, 0.13163856, 0.14371352, 0.19880624, 0.23360236, 0.12721846],
    [0.27362782, 0.21408406, 0.10917682, 0.13135742, 0.10814326, 0.16361059],
    [0.20697299, 0.23721898, 0.06455399, 0.11071447, 0.18990229, 0.19063729],
    [0.10320242, 0.22173141, 0.2547973 , 0.2314068 , 0.07063974, 0.11822232]
  ], dtype=tf.float32)

# find the indices in the batch where the true label is the class 0
indices = tf.where(tf.math.equal(y_true, 0))

# create a tensor with the number of updates you want to replace in `y_pred`
updates = tf.repeat(
    [[1.0, 0.0, 0.0, 0.0, 0.0, 0.0]],
    repeats=indices.shape[0],
    axis=0)

# insert the updates into `y_pred` at the specified indices
modified_y_pred = tf.tensor_scatter_nd_update(y_pred, indices, updates)

print(modified_y_pred)
# tf.Tensor(
#   [[0.34670502, 0.04551039, 0.14020428, 0.14341979, 0.21430719, 0.10985339],
#    [1.00000000, 0.00000000, 0.00000000, 0.00000000, 0.00000000, 0.00000000],
#    [0.09199252, 0.21889475, 0.1170236 , 0.1929019 , 0.20311192, 0.17607528],
#    [0.3246354 , 0.23257554, 0.15549366, 0.17282239, 0.00000001, 0.11447308],
#    [0.16502093, 0.13163856, 0.14371352, 0.19880624, 0.23360236, 0.12721846],
#    [1.00000000, 0.00000000, 0.00000000, 0.00000000, 0.00000000, 0.00000000],
#    [0.20697299, 0.23721898, 0.06455399, 0.11071447, 0.18990229, 0.19063729],
#    [1.00000000, 0.00000000, 0.00000000, 0.00000000, 0.00000000, 0.00000000]], 
#    shape=(8, 6), dtype=tf.float32)

This final tensor, modified_y_pred, can be used in differentiation.
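A quick way to confirm that, sketched with tf.GradientTape over the tensors defined above (y_pred is a constant here, so it has to be watched explicitly):

with tf.GradientTape() as tape:
    tape.watch(y_pred)
    out = tf.tensor_scatter_nd_update(y_pred, indices, updates)
    dummy_loss = tf.reduce_sum(out ** 2)

# Defined everywhere; zero at the overwritten rows, since the constant
# updates do not depend on y_pred.
grads = tape.gradient(dummy_loss, y_pred)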

EDIT:

This is probably easier to do with a mask.

For example:

# these aren't normalized to 1, but you get the point
probs = tf.random.normal([2, 4, 4, 6])

# raw labels per pixel
labels = tf.random.uniform(
    shape=[2, 4, 4],
    minval=0,
    maxval=6,
    dtype=tf.int64)

# your labels are already one-hot encoded
labels = tf.one_hot(labels, 6)

# boolean mask where classes are `0`
# converting back to int labels with argmax for purposes of
# using `tf.math.equal`. Matching on `[1, 0, 0, 0, 0, 0]` is
# potentially buggy; matching on an integer is a lot more
# explicit.
mask = tf.math.equal(tf.math.argmax(labels, -1), 0)[..., None]

# flip the mask to zero out the pixels across channels where
# labels are zero
probs *= tf.cast(tf.math.logical_not(mask), tf.float32)

# multiply the mask by the one-hot labels, and add back
# to the already masked probabilities.
probs += labels * tf.cast(mask, tf.float32)
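Tying this back to the question, a minimal sketch of a full weighted-CCE loss built on this mask approach (the function name and the weights handling are assumptions, not part of the original answer):

def weighted_cce_ignore_class0(weights):
    w = tf.constant(weights, dtype=tf.float32)

    def loss(y_true, y_pred):
        # zero out predictions at truly class-0 pixels, then insert the
        # one-hot class-0 vector there, exactly as in the mask example above
        mask = tf.math.equal(tf.math.argmax(y_true, -1), 0)[..., None]
        keep = tf.cast(tf.math.logical_not(mask), y_pred.dtype)
        yp = y_pred * keep + y_true * tf.cast(mask, y_pred.dtype)

        # standard weighted categorical cross-entropy from here on
        yp = yp / tf.reduce_sum(yp, axis=-1, keepdims=True)
        yp = tf.clip_by_value(yp, 1e-7, 1.0 - 1e-7)
        return -tf.reduce_sum(y_true * tf.math.log(yp) * w, axis=-1)

    return loss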
