Using both sample weights and class weights simultaneously

2 answers

User
Answer 1 · edited 2024-05-21 13:25:51

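To complement DarkCygnus's answer, for those who actually need to use class weights and sample weights at the same time:
Here is the code I use to generate sample weights for classifying multiclass temporal data in sequences.
(targets is an array of shape [#timesteps, #categories] whose values are class indices in range(#classes); class_weights is an array of shape [#categories, #classes].)
The generated sequence has the same length as the targets array. The common use case in batching is to pad the targets with zeros and to pad the sample weights with zeros up to the same size, so that the network ignores the padded data.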
    import numpy as np

    def multiclass_temoral_class_weights(targets, class_weights):
        # Default: every time step gets a weight of 1.
        s_weights = np.ones((targets.shape[0],))
        # if we are counting the classes, the weights do not exist yet!
        if class_weights is not None:
            for i in range(len(s_weights)):
                weight = 0.0
                # Add up the weight of the class observed in each category.
                for itarget, target in enumerate(targets[i]):
                    weight += class_weights[itarget][int(round(target))]
                s_weights[i] = weight
        return s_weights
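As a rough sketch of the same idea, the per-step sum can also be written with NumPy indexing, followed by the zero padding described above. The names temporal_sample_weights and pad_to_length are hypothetical helpers made up for this illustration, and the sketch assumes class_weights is a rectangular array (every category has the same number of classes).

```python
import numpy as np

def temporal_sample_weights(targets, class_weights):
    """Sum, for every time step, the weight of the class seen in each category."""
    targets = np.asarray(targets)
    class_weights = np.asarray(class_weights, dtype=float)
    idx = np.rint(targets).astype(int)             # [time, categories] class indices
    cat = np.arange(class_weights.shape[0])        # [categories]
    # Broadcasting picks class_weights[j, idx[i, j]] for every step i, category j.
    return class_weights[cat, idx].sum(axis=1)     # [time]

def pad_to_length(values, length):
    """Right-pad a 1D array with zeros so padded steps carry no weight."""
    values = np.asarray(values, dtype=float)
    return np.pad(values, (0, length - len(values)))

# Toy data (assumed): 3 time steps, 2 categories, up to 3 classes each.
targets = np.array([[0, 1], [1, 2], [1, 0]])
class_weights = np.array([[1.0, 4.0, 0.0],
                          [1.0, 2.0, 3.0]])

weights = temporal_sample_weights(targets, class_weights)
padded = pad_to_length(weights, 5)   # batch length of 5 is an assumption
```

The padded array would then be supplied as the per-step sample weights for a sequence that was itself padded to length 5.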

User
Answer 2 · edited 2024-05-21 13:25:51

You can certainly use both if you want to; the question is whether you need to. According to the Keras docs:
class_weight: Optional dictionary mapping class indices (integers) to a weight (float) value, used for weighting the loss function (during training only). This can be useful to tell the model to "pay more attention" to samples from an under-represented class.
sample_weight: Optional Numpy array of weights for the training samples, used for weighting the loss function (during training only). You can either pass a flat (1D) Numpy array with the same length as the input samples (1:1 mapping between weights and samples), or in the case of temporal data [...].
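So, given that you mention you "have far more of the first class than the second", I think you should go for the class_weight parameter. There you can state the ratio present in your dataset so that the imbalanced classes are compensated for. sample_weight is more for when you want to define a weight or importance for each individual data element.
For example, if you pass: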
    class_weight = {0: 1., 1: 50.}
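you are saying that every sample from class 1 counts as 50 samples from class 0, which gives more "importance" to the elements of class 1 (since you surely have fewer of those samples). You can customize this to fit your own needs. More information on imbalanced datasets is in the question linked from this answer.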
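One possible way to arrive at such a dictionary is to derive it from the label counts, weighting each class by how much rarer it is than the most frequent one. This is only a sketch of one heuristic, and the label array y below is made-up data chosen to reproduce the 1:50 ratio.

```python
import numpy as np

# Assumed toy labels: 100 samples of class 0 and 2 samples of class 1.
y = np.array([0] * 100 + [1] * 2)

counts = np.bincount(y)                      # samples per class: [100, 2]
# Weight each class by (largest class count / its own count).
class_weight = {cls: float(counts.max() / n) for cls, n in enumerate(counts)}
```

The resulting dictionary can be passed as the class_weight argument described in the docs excerpt above.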
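Note: to compare the two parameters further, keep in mind that passing class_weight as {0:1., 1:50.} is equivalent to passing sample_weight as [1.,1.,1.,...,50.,50.,...], given samples whose classes are [0,0,0,...,1,1,...].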
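The equivalence in the note can be made concrete with a small sketch that expands a class_weight dictionary into the matching sample_weight array. The labels in y are assumed toy data.

```python
import numpy as np

class_weight = {0: 1.0, 1: 50.0}
y = np.array([0, 0, 0, 1, 1])   # assumed class label of each sample

# One weight per sample, looked up from the class it belongs to.
sample_weight = np.array([class_weight[int(label)] for label in y])
```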
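As we can see, class_weight is more practical in this case, while sample_weight is useful in more specific cases where you actually want to give each sample its own "importance". Both can be used together if the situation requires it, but you have to keep their cumulative effect in mind.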
Edit: regarding your new question, digging into the Keras source code, it seems that sample_weights does indeed override class_weights. The code that does this is in the _standardize_weights method (line 499).
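This means you can only use one or the other, not both. So you will indeed need to multiply your sample_weights by the ratio you need to compensate for the imbalance.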
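A minimal sketch of that multiplication, assuming only sample_weight is honoured as described above: each sample's own weight is scaled by the weight of its class, and the product is used as the single sample_weight array. combine_weights is a hypothetical helper name, and the arrays are toy data.

```python
import numpy as np

def combine_weights(y, sample_weight, class_weight):
    """Fold class weights into per-sample weights by elementwise multiplication."""
    per_class = np.array([class_weight[int(label)] for label in np.asarray(y)],
                         dtype=float)
    return np.asarray(sample_weight, dtype=float) * per_class

y = np.array([0, 0, 1, 1])                      # assumed class labels
sample_weight = np.array([1.0, 0.5, 1.0, 2.0])  # assumed per-sample importance
class_weight = {0: 1.0, 1: 50.0}

combined = combine_weights(y, sample_weight, class_weight)
```

The combined array would then be passed as sample_weight on its own, with class_weight left unset, so the class ratio is applied exactly once.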
