How do I calculate the output size of MaxPooling2D, Conv2D and UpSampling2D layers?


I am learning about convolutional autoencoders and I am using Keras to build an image denoiser. The following code builds the model:

from keras.models import Sequential
from keras.layers import Conv2D, MaxPooling2D, UpSampling2D, Activation

denoiser = Sequential()
denoiser.add(Conv2D(32, (3,3), input_shape=(28,28,1), padding='same'))
denoiser.add(Activation('relu'))
denoiser.add(MaxPooling2D(pool_size=(2,2)))

denoiser.add(Conv2D(16, (3,3), padding='same'))
denoiser.add(Activation('relu'))
denoiser.add(MaxPooling2D(pool_size=(2,2)))

denoiser.add(Conv2D(8, (3,3), padding='same'))
denoiser.add(Activation('relu'))

################## HEY WHAT NO MAXPOOLING?

denoiser.add(Conv2D(8, (3,3), padding='same'))
denoiser.add(Activation('relu'))
denoiser.add(UpSampling2D((2,2)))

denoiser.add(Conv2D(16, (3,3), padding='same'))
denoiser.add(Activation('relu'))
denoiser.add(UpSampling2D((2,2)))

denoiser.add(Conv2D(1, (3,3), padding='same'))

denoiser.compile(optimizer='adam', loss='mean_squared_error', metrics=['accuracy'])
denoiser.summary()

The summary from denoiser.summary() shows the output shape of every layer, but I am not sure how to calculate the MaxPooling2D, Conv2D and UpSampling2D output sizes myself. I have read the Keras documentation, yet I am still confused. Many parameters affect the output shape, such as the strides and padding of a Conv2D layer, and I do not understand exactly how they influence the result.

I also do not understand why there is no MaxPooling2D layer before the commented line. Editing the code to include a convmodel3.add(MaxPooling2D(pool_size=(2,2))) layer above that comment changes the final output shape to (None, 12, 12, 1).

Editing the code to include the convmodel3.add(MaxPooling2D(pool_size=(2,2))) layer before the comment and a convmodel3.add(UpSampling2D((2,2))) after it changes the final output to (None, 24, 24, 1). Shouldn't that be (None, 28, 28, 1)? Code and summary:

convmodel3 = Sequential()
convmodel3.add(Conv2D(32, (3,3), input_shape=(28,28,1), padding='same')) 
convmodel3.add(Activation('relu'))
convmodel3.add(MaxPooling2D(pool_size=(2,2)))

convmodel3.add(Conv2D(16, (3,3), padding='same'))
convmodel3.add(Activation('relu'))
convmodel3.add(MaxPooling2D(pool_size=(2,2)))

convmodel3.add(Conv2D(8, (3,3), padding='same'))
convmodel3.add(Activation('relu'))
convmodel3.add(MaxPooling2D(pool_size=(2,2))) # ADDED MAXPOOL

################## HEY WHAT NO MAXPOOLING?

convmodel3.add(UpSampling2D((2,2))) # ADDED UPSAMPLING
convmodel3.add(Conv2D(16, (3,3), padding='same'))
convmodel3.add(Activation('relu'))
convmodel3.add(UpSampling2D((2,2)))

convmodel3.add(Conv2D(32, (3,3), padding='same'))
convmodel3.add(Activation('relu'))
convmodel3.add(UpSampling2D((2,2)))

convmodel3.add(Conv2D(1, (3,3), padding='same'))

convmodel3.compile(optimizer='adam', loss='mean_squared_error', metrics=['accuracy'])
convmodel3.summary()

_________________________________________________________________
Layer (type)                 Output Shape              Param #   
=================================================================
conv2d_247 (Conv2D)          (None, 28, 28, 32)        320       
_________________________________________________________________
activation_238 (Activation)  (None, 28, 28, 32)        0         
_________________________________________________________________
max_pooling2d_141 (MaxPoolin (None, 14, 14, 32)        0         
_________________________________________________________________
conv2d_248 (Conv2D)          (None, 14, 14, 16)        4624      
_________________________________________________________________
activation_239 (Activation)  (None, 14, 14, 16)        0         
_________________________________________________________________
max_pooling2d_142 (MaxPoolin (None, 7, 7, 16)          0         
_________________________________________________________________
conv2d_249 (Conv2D)          (None, 7, 7, 8)           1160      
_________________________________________________________________
activation_240 (Activation)  (None, 7, 7, 8)           0         
_________________________________________________________________
max_pooling2d_143 (MaxPoolin (None, 3, 3, 8)           0         
_________________________________________________________________
up_sampling2d_60 (UpSampling (None, 6, 6, 8)           0         
_________________________________________________________________
conv2d_250 (Conv2D)          (None, 6, 6, 16)          1168      
_________________________________________________________________
activation_241 (Activation)  (None, 6, 6, 16)          0         
_________________________________________________________________
up_sampling2d_61 (UpSampling (None, 12, 12, 16)        0         
_________________________________________________________________
conv2d_251 (Conv2D)          (None, 12, 12, 32)        4640      
_________________________________________________________________
activation_242 (Activation)  (None, 12, 12, 32)        0         
_________________________________________________________________
up_sampling2d_62 (UpSampling (None, 24, 24, 32)        0         
_________________________________________________________________
conv2d_252 (Conv2D)          (None, 24, 24, 1)         289       
=================================================================
Total params: 12,201
Trainable params: 12,201
Non-trainable params: 0
_________________________________________________________________

What is the meaning of None in the output shapes?

Also, if I edit the Conv2D layers so that they do not include padding, an error is raised:

ValueError: Negative dimension size caused by subtracting 3 from 2 for 'conv2d_240/convolution' (op: 'Conv2D') with input shapes: [?,2,2,16], [3,3,16,32].

Why?


1 Answer

For a convolutional (here 2D) layer, the important things to consider are the volume of the image (width x height x depth) and the four parameters you give the layer. These parameters are

  • the number of filters K
  • the spatial extent (size) of the filters F
  • the stride S with which the filter moves
  • the amount of zero padding P

The formulas for the output shape are

  1. W_new = (W - F + 2P) / S + 1
  2. H_new = (H - F + 2P) / S + 1
  3. D_new = K

This is taken from the thread what is the effect of tf.nn.conv2d() on an input tensor shape?, where you can find more about zero padding and related details.
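
As a quick check of these formulas, here is a minimal sketch in plain Python (the helper names conv_out and pool_out are purely illustrative, not Keras functions) that reproduces the spatial sizes from your summary:

def conv_out(w, f, p, s):
    # spatial output size of a convolution: (W - F + 2P) / S + 1
    return (w - f + 2 * p) // s + 1

def pool_out(w, pool, stride=None):
    # spatial output size of pooling; the stride defaults to the pool size
    stride = stride or pool
    return (w - pool) // stride + 1

w = 28
w = conv_out(w, f=3, p=1, s=1)  # padding='same' with a 3x3 filter means P=1, so 28 stays 28
w = pool_out(w, pool=2)         # 28 -> 14
w = conv_out(w, f=3, p=1, s=1)  # still 14
w = pool_out(w, pool=2)         # 14 -> 7
print(w)                        # 7, matching the (None, 7, 7, ...) rows in the summary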

As for max pooling and upsampling, the output size is affected only by the pool size and the stride. In your example the pool size is (2,2) and no stride is defined, so it defaults to the pool size (see https://keras.io/layers/pooling/). Upsampling works the same way. Max pooling takes each pool of 2x2 pixels and keeps only the maximum value, so a 2x2 block of pixels is reduced to a single pixel. Upsampling is the opposite: instead of reducing a pool to one value, it repeats each value across a 2x2 block.
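
Here is a minimal sketch, using the same Keras layers, that shows both behaviours on a tiny 4x4 single-channel input (the values are made up purely for illustration):

import numpy as np
from keras.models import Sequential
from keras.layers import MaxPooling2D, UpSampling2D

# one image, 4x4 pixels, one channel: shape is (batch, height, width, channels)
x = np.arange(16, dtype='float32').reshape(1, 4, 4, 1)

pool_model = Sequential([MaxPooling2D(pool_size=(2, 2), input_shape=(4, 4, 1))])
pooled = pool_model.predict(x)
print(pooled.shape)        # (1, 2, 2, 1)
print(pooled[0, :, :, 0])  # [[ 5.  7.] [13. 15.]]: the maximum of each 2x2 block

up_model = Sequential([UpSampling2D(size=(2, 2), input_shape=(2, 2, 1))])
print(up_model.predict(pooled).shape)  # (1, 4, 4, 1): every pixel repeated into a 2x2 block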

The reason there is no max pooling layer there, and the reason your image dimensions get messy when you add one, is the image size at that stage. By that point in the network the feature map is already [7, 7, 8]. With a pool size of (2, 2) and a stride of 2, pooling reduces it to [3, 3, 8]. After the upsampling layers the spatial size then goes 3 -> 6 -> 12 -> 24, so you end up 4 pixels short in every row and column.
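
A minimal sketch of that size arithmetic, using integer division as a stand-in for pooling with the default stride:

size = 28
for _ in range(3):    # the three MaxPooling2D layers in convmodel3
    size = size // 2  # 28 -> 14 -> 7 -> 3 (the odd 7 is floored)
for _ in range(3):    # the three UpSampling2D layers
    size = size * 2   # 3 -> 6 -> 12 -> 24
print(size)           # 24, not 28: the rows and columns lost when 7 is floored to 3 are never recovered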

As for the meaning of None (correct me if I am wrong, I am not 100% sure about this): the network expects the convolutional layers to receive a whole batch of images rather than a single one. The dimensions it expects are

[Number of images, Width, Height, Depth]

So the first element is reported as None because the number of images per batch is not fixed when the model is built; Keras leaves that dimension unspecified until you actually feed data in (again, I am not completely certain about this).
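
To see this in practice, here is a minimal sketch (assuming the convmodel3 defined above) that feeds batches of two different sizes through the same model; None simply marks the batch dimension as not fixed:

import numpy as np

one_image   = np.random.rand(1, 28, 28, 1).astype('float32')   # a batch of 1 image
five_images = np.random.rand(5, 28, 28, 1).astype('float32')   # a batch of 5 images

print(convmodel3.predict(one_image).shape)    # (1, 24, 24, 1)
print(convmodel3.predict(five_images).shape)  # (5, 24, 24, 1)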
