Does DropPath in TIMM just look like a Dropout?

The code below (taken from here) seems to implement only a plain Dropout, neither DropPath nor DropConnect. Is that true?

import torch

def drop_path(x, drop_prob: float = 0., training: bool = False):
    """Drop paths (Stochastic Depth) per sample (when applied in main path of residual blocks).
    This is the same as the DropConnect impl I created for EfficientNet, etc networks, however,
    the original name is misleading as 'Drop Connect' is a different form of dropout in a separate paper...
    See discussion: https://github.com/tensorflow/tpu/issues/494#issuecomment-532968956 ... I've opted for
    changing the layer and argument names to 'drop path' rather than mix DropConnect as a layer name and use
    'survival rate' as the argument.
    """
    if drop_prob == 0. or not training:
        return x
    keep_prob = 1 - drop_prob
    shape = (x.shape[0],) + (1,) * (x.ndim - 1)  # work with diff dim tensors, not just 2D ConvNets
    random_tensor = keep_prob + torch.rand(shape, dtype=x.dtype, device=x.device)
    random_tensor.floor_()  # binarize
    output = x.div(keep_prob) * random_tensor
    return output
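
For context, here is a minimal sketch (my own illustration, with an nn.Linear standing in for the block's real branch) of how drop_path is meant to be used in the main path of a residual block. Dropping the branch output for a sample leaves only the identity, so that sample effectively skips the block:

import torch.nn as nn

class Block(nn.Module):
    def __init__(self, dim: int, drop_prob: float = 0.1):
        super().__init__()
        self.fc = nn.Linear(dim, dim)  # stand-in for the real main branch
        self.drop_prob = drop_prob

    def forward(self, x):
        # If drop_path zeroes the branch for a sample, only the identity
        # path remains and that sample skips this block entirely.
        return x + drop_path(self.fc(x), self.drop_prob, self.training)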

1 Answer

No, it is not the same as torch.nn.functional.dropout:

import torch
from torch.nn.functional import dropout

torch.manual_seed(2021)

def drop_path(x, drop_prob: float = 0., training: bool = False):
    if drop_prob == 0. or not training:
        return x
    keep_prob = 1 - drop_prob
    shape = (x.shape[0],) + (1,) * (x.ndim - 1)
    random_tensor = keep_prob + torch.rand(shape, dtype=x.dtype, device=x.device)
    random_tensor.floor_()  # binarize
    output = x.div(keep_prob) * random_tensor
    return output

x = torch.rand(3, 2, 2, 2)

# DropPath
d1_out = drop_path(x, drop_prob=0.33, training=True)

# Dropout
d2_out = dropout(x, p=0.33, training=True)

Let's compare the outputs (for readability, I removed the line breaks between channels):

# DropPath
print(d1_out)
#  tensor([[[[0.1947, 0.7662],
#            [1.1083, 1.0685]],
#           [[0.8515, 0.2467],
#            [0.0661, 1.4370]]],
#
#          [[[0.0000, 0.0000],
#            [0.0000, 0.0000]],
#           [[0.0000, 0.0000],
#            [0.0000, 0.0000]]],
#
#          [[[0.7658, 0.4417],
#            [1.1692, 1.1052]],
#           [[1.2014, 0.4532],
#            [1.4840, 0.7499]]]])

# Dropout
print(d2_out)
#  tensor([[[[0.1947, 0.7662],
#            [1.1083, 1.0685]],
#           [[0.8515, 0.2467],
#            [0.0661, 1.4370]]],
#
#          [[[0.0000, 0.1480],
#            [1.2083, 0.0000]],
#           [[1.2272, 0.1853],
#            [0.0000, 0.5385]]],
#
#          [[[0.7658, 0.0000],
#            [1.1692, 1.1052]],
#           [[1.2014, 0.4532],
#            [0.0000, 0.7499]]]])

As you can see, they are different. DropPath drops entire samples from the batch, which, when applied as in Eq. 2 of its paper (Deep Networks with Stochastic Depth), effectively results in stochastic depth. Dropout, on the other hand, zeroes random individual values, as expected (from the docs):

During training, randomly zeroes some of the elements of the input tensor with probability p using samples from a Bernoulli distribution. Each channel will be zeroed out independently on every forward call.
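
This is easy to verify on the outputs above: DropPath zeroes an entire sample, while Dropout scatters zeros within every sample. A quick sanity check (my own addition, reusing d1_out and d2_out from above):

print((d1_out.flatten(1) == 0).all(dim=1))
# tensor([False,  True, False])  -> sample 1 is fully dropped by DropPath
print((d2_out.flatten(1) == 0).all(dim=1))
# tensor([False, False, False])  -> no sample is fully dropped by Dropout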

Also note that both methods scale the output values based on the probability, i.e., for the same p, the elements that were not zeroed out are identical between the two methods.
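
Concretely, drop_path divides by keep_prob = 1 - p while dropout scales by 1 / (1 - p), so the surviving elements agree exactly. A small check (a sketch, assuming x, d1_out, and d2_out from above are still in scope):

mask = (d1_out != 0) & (d2_out != 0)   # elements kept by both methods
expected = x / (1 - 0.33)              # both rescale survivors by 1 / (1 - p)
print(torch.allclose(d1_out[mask], expected[mask]))  # True
print(torch.allclose(d2_out[mask], expected[mask]))  # True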
