Multi-label text classification

1 Answer

User

#1 · Posted 2024-09-28 01:24:01

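The label encoding looks correct. If there are multiple correct labels, [1 0 1 0 ... 1] is fine. The loss function used in Denny's post is tf.nn.softmax_cross_entropy_with_logits, which is a loss function for multi-class problems: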

Computes softmax cross entropy between logits and labels.
Measures the probability error in discrete classification tasks in which the classes are mutually exclusive (each entry is in exactly one class).

For a multi-label problem, you should use tf.nn.sigmoid_cross_entropy_with_logits instead:

Computes sigmoid cross entropy given logits.
Measures the probability error in discrete classification tasks in which each class is independent and not mutually exclusive. For instance, one could perform multilabel classification where a picture can contain both an elephant and a dog at the same time.

The inputs to the loss function are the logits (WX) and the targets (labels).
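To make the difference concrete, here is a minimal plain-Python sketch (not TensorFlow code) of the numerically stable formula that the TensorFlow documentation gives for sigmoid cross entropy, max(x, 0) - x * z + log(1 + exp(-|x|)), next to a softmax over the same logits. The logit values and the helper names are made up for illustration.

```python
import math

def sigmoid_cross_entropy_with_logits(logits, labels):
    """Element-wise sigmoid cross entropy in its numerically stable form:
    max(x, 0) - x * z + log(1 + exp(-|x|)) for logit x and label z."""
    return [max(x, 0.0) - x * z + math.log1p(math.exp(-abs(x)))
            for x, z in zip(logits, labels)]

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def softmax(logits):
    # Subtract the max before exponentiating for numerical stability.
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

logits = [3.0, -2.0, 2.5, -1.0]   # hypothetical scores (WX) for 4 classes
labels = [1.0, 0.0, 1.0, 0.0]     # classes 0 and 2 are both correct

# One independent loss term per class, then averaged.
per_class_loss = sigmoid_cross_entropy_with_logits(logits, labels)
mean_loss = sum(per_class_loss) / len(per_class_loss)

# Sigmoid scores each class on its own, so two classes can both be near 1.
sigmoid_probs = [sigmoid(x) for x in logits]

# Softmax forces the classes to share a total probability of 1.
softmax_probs = softmax(logits)

print("per-class loss:", per_class_loss)
print("mean loss:", mean_loss)
print("sigmoid probs:", sigmoid_probs)
print("softmax probs:", softmax_probs)
```

Because the softmax outputs sum to 1, classes 0 and 2 cannot both receive a high probability, which is why the softmax loss suits mutually exclusive classes only.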

Fixing the accuracy measurement

To measure accuracy correctly for a multi-label problem, the following code needs to change.

# Calculate Accuracy
with tf.name_scope("accuracy"):
    correct_predictions = tf.equal(self.predictions, tf.argmax(self.input_y, 1))
    self.accuracy = tf.reduce_mean(tf.cast(correct_predictions, "float"), name="accuracy")

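The logic of correct_predictions above is incorrect when there can be multiple correct labels. For example, suppose num_classes=4 and labels 0 and 2 are correct, so input_y=[1, 0, 1, 0]. tf.argmax(self.input_y, 1) then has to break the tie between index 0 and index 2. I am not sure how tf.argmax breaks ties, but if it does so by choosing the smaller index, a prediction of label 2 will always be counted as wrong, which will certainly hurt your accuracy metric.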
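One possible replacement, sketched here in plain Python rather than TensorFlow, is to threshold each class's sigmoid output and compare the whole prediction vector against input_y instead of comparing a single argmax index. The 0.5 threshold, the example logits, and the helper names are assumptions made for illustration; they do not come from the original post.

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def predict_labels(logits, threshold=0.5):
    """Turn one row of logits into a 0/1 vector, one decision per class."""
    return [1 if sigmoid(x) >= threshold else 0 for x in logits]

def elementwise_accuracy(pred_batch, label_batch):
    """Fraction of individual (example, class) decisions that are correct."""
    total = 0
    correct = 0
    for pred, label in zip(pred_batch, label_batch):
        for p, y in zip(pred, label):
            total += 1
            correct += int(p == y)
    return correct / total

def exact_match_accuracy(pred_batch, label_batch):
    """Fraction of examples whose full label vector is predicted exactly."""
    matches = sum(1 for p, y in zip(pred_batch, label_batch) if p == y)
    return matches / len(label_batch)

# Hypothetical batch of two examples with num_classes=4.
logits_batch = [[3.0, -2.0, 2.5, -1.0],
                [-1.5, 2.0, 0.5, -3.0]]
input_y = [[1, 0, 1, 0],
           [0, 1, 0, 0]]

predictions = [predict_labels(row) for row in logits_batch]
print(predictions)
print(elementwise_accuracy(predictions, input_y))
print(exact_match_accuracy(predictions, input_y))
```

Element-wise accuracy rewards partially correct predictions, while exact-match accuracy is stricter; which one to report depends on the application.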

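In fact, for multi-label problems, precision and recall are better metrics than accuracy. You could also consider reporting classifier performance with precision@k (tf.nn.in_top_k).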
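As a rough illustration of these metrics, the sketch below computes precision, recall, and precision@k for a single example in plain Python. It does not call tf.nn.in_top_k; the helper names, scores, and predictions are hypothetical.

```python
def precision_recall(pred, label):
    """Precision and recall for one example's 0/1 prediction vector."""
    tp = sum(1 for p, y in zip(pred, label) if p == 1 and y == 1)
    fp = sum(1 for p, y in zip(pred, label) if p == 1 and y == 0)
    fn = sum(1 for p, y in zip(pred, label) if p == 0 and y == 1)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall

def precision_at_k(scores, label, k):
    """Fraction of the k highest-scoring classes that are correct labels."""
    ranked = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    top_k = ranked[:k]
    return sum(label[i] for i in top_k) / k

label = [1, 0, 1, 0]              # classes 0 and 2 are correct
scores = [3.0, 0.5, 1.2, -1.0]    # hypothetical logits for one example
pred = [1, 1, 1, 0]               # classes predicted positive (logit > 0)

precision, recall = precision_recall(pred, label)
p_at_2 = precision_at_k(scores, label, k=2)

print("precision:", precision)
print("recall:", recall)
print("precision@2:", p_at_2)
```

Here the classifier finds both correct labels (full recall) but also predicts class 1 wrongly, which lowers precision; precision@2 looks only at the two highest-scoring classes.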
