Shape mismatch between two equivalent networks?

Posted 2024-09-30 20:37:29


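So, the answer may well be "these two networks are not equivalent", and I am clearly missing something. As I understand it they should do the same thing, yet the dimensions of their intermediate outputs differ. My main problem is with PyTorch's ConvTranspose3d and Conv3d. Specifically, I have an input of shape (1, 44, 68, 120), where 44 is the depth dimension and 68 and 120 are the spatial dimensions (h and w). I use strided conv/convtranspose layers to move down/up in dimension. Shouldn't this:

    self.conv3d1_1 = nn.Conv3d(in_channels=1, out_channels=32, groups=1, stride=(2,2,2), kernel_size=3, padding=1)

be equivalent, in terms of output dimensions, to these two layers?

    self.conv3d1_1 = nn.Conv3d(in_channels=1, out_channels=32, groups=1, stride=(2,1,1), kernel_size=3, padding=1)
    self.conv3d1_2 = nn.Conv3d(in_channels=32, out_channels=32, groups=1, stride=(1,2,2), kernel_size=3, padding=1)

The first variant halves all dimensions at once, while the second first reduces the temporal (depth) dimension and then the spatial dimensions?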
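For the downsampling half of the question, the output sizes can be checked by hand. The sketch below is plain Python arithmetic, not PyTorch itself: it applies the per-axis output size formula given in the PyTorch `Conv3d` documentation, with dilation assumed to be 1. The helper names `conv_out` and `conv3d_shape` are made up for illustration.

```python
def conv_out(size, kernel, stride, padding, dilation=1):
    """Output length of one axis after a strided convolution.

    floor((size + 2*padding - dilation*(kernel-1) - 1) / stride) + 1
    """
    return (size + 2 * padding - dilation * (kernel - 1) - 1) // stride + 1


def conv3d_shape(shape, kernel, stride, padding):
    """Apply conv_out to each of the (depth, h, w) axes."""
    return tuple(conv_out(n, kernel, s, padding) for n, s in zip(shape, stride))


inp = (44, 68, 120)  # (depth, h, w) from the question, batch/channels omitted

# One layer with stride (2, 2, 2)
single = conv3d_shape(inp, kernel=3, stride=(2, 2, 2), padding=1)

# Two layers: stride (2, 1, 1) followed by stride (1, 2, 2)
two_step = conv3d_shape(
    conv3d_shape(inp, kernel=3, stride=(2, 1, 1), padding=1),
    kernel=3, stride=(1, 2, 2), padding=1,
)

print(single)    # (22, 34, 60)
print(two_step)  # (22, 34, 60)
```

With kernel_size=3 and padding=1, an axis with stride 1 keeps its length, so both variants agree on the output shape here. This only says something about shapes, not about whether the two variants compute the same function.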

So, the first network:

    self.conv3d1_1 = nn.Conv3d(in_channels=1, out_channels=32, groups=1, stride=(2,2,2), kernel_size=3, padding=1)  # depthwise convolution (1 ch, 44 depth dimension, h, w)
    self.conv3d1_2 = nn.Conv3d(in_channels=32, out_channels=64, groups=1, kernel_size=3, padding=1)

    self.conv3d2_1 = nn.Conv3d(in_channels=64, out_channels=128, groups=1, stride=(2, 2, 2), kernel_size=3, padding=1)
    self.conv3d2_2 = nn.Conv3d(in_channels=128, out_channels=128, groups=1, kernel_size=3, padding=1)

    self.conv3d3_1 = nn.ConvTranspose3d(in_channels=128, out_channels=128, stride=(2, 2, 2), groups=1, kernel_size=4, padding=(1,1,1))
    self.conv3d3_2 = nn.Conv3d(in_channels=128, out_channels=128, groups=1, kernel_size=3, padding=1)

    self.conv3d4_1 = nn.ConvTranspose3d(in_channels=128, out_channels=128, stride=(2, 2, 2), groups=1, kernel_size=4, padding=1)
    self.conv3d4_2 = nn.Conv3d(in_channels=128, out_channels=64, groups=1, kernel_size=3, padding=1)

produces these intermediate sizes:

torch.Size([2, 1, 44, 68, 120])
torch.Size([2, 32, 22, 34, 60])
torch.Size([2, 64, 22, 34, 60])
torch.Size([2, 128, 11, 17, 30])
torch.Size([2, 128, 11, 17, 30])
torch.Size([2, 128, 22, 34, 60])
torch.Size([2, 128, 22, 34, 60])
torch.Size([2, 128, 44, 68, 120])
torch.Size([2, 64, 44, 68, 120])

Everything looks fine.
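The sizes listed for the first network can be reproduced with a small shape tracer. This is a sketch in plain Python, assuming the output size formulas from the PyTorch `Conv3d` and `ConvTranspose3d` documentation with dilation 1 and output_padding 0. The `layers` list is a hand-written summary of the eight layers above, and the helpers `conv_out`, `convT_out` and `trace` are hypothetical names.

```python
def conv_out(n, k, s, p):
    """Per-axis output length of a convolution (dilation 1)."""
    return (n + 2 * p - (k - 1) - 1) // s + 1


def convT_out(n, k, s, p, output_padding=0):
    """Per-axis output length of a transposed convolution (dilation 1)."""
    return (n - 1) * s - 2 * p + (k - 1) + output_padding + 1


# (kind, kernel_size, stride per axis, padding) for the first network
layers = [
    ("conv", 3, (2, 2, 2), 1),   # conv3d1_1
    ("conv", 3, (1, 1, 1), 1),   # conv3d1_2
    ("conv", 3, (2, 2, 2), 1),   # conv3d2_1
    ("conv", 3, (1, 1, 1), 1),   # conv3d2_2
    ("convT", 4, (2, 2, 2), 1),  # conv3d3_1
    ("conv", 3, (1, 1, 1), 1),   # conv3d3_2
    ("convT", 4, (2, 2, 2), 1),  # conv3d4_1
    ("conv", 3, (1, 1, 1), 1),   # conv3d4_2
]


def trace(shape, layers):
    """Return the (depth, h, w) shape before and after every layer."""
    shapes = [shape]
    for kind, k, stride, p in layers:
        f = conv_out if kind == "conv" else convT_out
        shape = tuple(f(n, k, s, p) for n, s in zip(shape, stride))
        shapes.append(shape)
    return shapes


shapes = trace((44, 68, 120), layers)
for s in shapes:
    print(s)
```

The key step is the transposed convolution: with kernel_size=4, padding=1 and stride 2, the formula gives (n - 1) * 2 - 2 + 3 + 1 = 2n, so each upsampling layer exactly doubles every axis and the network returns to (44, 68, 120).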

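The second network (with only one reduction stage to keep it short, but the same thing, only worse, happens when downsampling by 2x twice (so 4x)):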

    self.conv3d1_1 = nn.Conv3d(in_channels=1, out_channels=32, groups=1, stride=(2,1,1), kernel_size=3, padding=1)  # depthwise convolution (1 ch, 44 depth dimension, h, w)
    self.conv3d1_2 = nn.Conv3d(in_channels=32, out_channels=64, groups=1, kernel_size=3, padding=1)

    self.conv3d2_1 = nn.Conv3d(in_channels=64, out_channels=128, groups=1, stride=(1, 2, 2), kernel_size=3, padding=1)
    self.conv3d2_2 = nn.Conv3d(in_channels=128, out_channels=128, groups=1, kernel_size=3, padding=1)

    self.conv3d3_1 = nn.ConvTranspose3d(in_channels=128, out_channels=128, stride=(1, 2, 2), groups=1, kernel_size=4, padding=(1,1,1))
    self.conv3d3_2 = nn.Conv3d(in_channels=128, out_channels=128, groups=1, kernel_size=3, padding=1)

    self.conv3d4_1 = nn.ConvTranspose3d(in_channels=128, out_channels=128, stride=(2, 1, 1), groups=1, kernel_size=4, padding=1)
    self.conv3d4_2 = nn.Conv3d(in_channels=128, out_channels=64, groups=1, kernel_size=3, padding=1)

These are the intermediate sizes:

torch.Size([2, 1, 44, 68, 120])
torch.Size([2, 32, 22, 68, 120])
torch.Size([2, 64, 22, 68, 120])
torch.Size([2, 128, 22, 34, 60])
torch.Size([2, 128, 22, 34, 60])
torch.Size([2, 128, 23, 68, 120])
torch.Size([2, 128, 23, 68, 120])
torch.Size([2, 128, 46, 69, 121])
torch.Size([2, 64, 46, 69, 121])

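For some reason, the first ConvTranspose3d increases the temporal dimension (22 to 23), when it should only act on the spatial dimensions? I initially thought this was a padding issue, but changing the padding does not solve it. The same goes for the temporal ConvTranspose3d, which increases each spatial dimension by 1 (68 to 69, 120 to 121).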

Any clues? Thanks in advance.
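One possible reading of the numbers above, offered as an arithmetic sketch rather than a definitive diagnosis: the `ConvTranspose3d` documentation gives the per-axis output length as (n - 1) * stride - 2 * padding + dilation * (kernel_size - 1) + output_padding + 1. With kernel_size=4 and padding=1, a stride-2 axis comes out as 2n, but a stride-1 axis comes out as n + 1. The plain Python below (dilation 1 and output_padding 0 assumed, helper names hypothetical) reproduces the sizes in the question. It then tries a hypothetical alternative with a per-axis kernel, 3 on stride-1 axes and 4 on stride-2 axes, which `nn.ConvTranspose3d` accepts as a tuple. The intermediate 3x3x3 convolutions with padding 1 are skipped because they keep the shape.

```python
def convT_out(n, k, s, p, output_padding=0):
    """Per-axis output length of a transposed convolution (dilation 1)."""
    return (n - 1) * s - 2 * p + (k - 1) + output_padding + 1


def convT3d_shape(shape, kernel, stride, padding):
    """Apply convT_out to each of the (depth, h, w) axes."""
    return tuple(
        convT_out(n, k, s, p)
        for n, k, s, p in zip(shape, kernel, stride, padding)
    )


x = (22, 34, 60)  # shape entering the first ConvTranspose3d of network 2

# As written in the question: kernel 4 on every axis, padding 1
up1 = convT3d_shape(x, (4, 4, 4), (1, 2, 2), (1, 1, 1))
up2 = convT3d_shape(up1, (4, 4, 4), (2, 1, 1), (1, 1, 1))
print(up1, up2)  # (23, 68, 120) (46, 69, 121)

# Hypothetical alternative: kernel 3 wherever the stride is 1
alt1 = convT3d_shape(x, (3, 4, 4), (1, 2, 2), (1, 1, 1))
alt2 = convT3d_shape(alt1, (4, 3, 3), (2, 1, 1), (1, 1, 1))
print(alt1, alt2)  # (22, 68, 120) (44, 68, 120)
```

Under these assumptions the extra element comes from the even kernel on the unstrided axis, not from the stride itself, which would also explain why changing only the padding does not restore the expected sizes for both strided and unstrided axes at once.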

