Implementing a swapout layer in Keras

Published 2024-09-30 23:39:59


I am currently trying to implement a swapout layer in Keras (as described in http://papers.nips.cc/paper/6205-swapout-learning-an-ensemble-of-deep-architectures.pdf). If I understand the concept correctly, the output of, say, an ordinary dense layer is manipulated: for each unit, the output is chosen randomly among 0, x, F(x), and x + F(x), where x is the input and F(x) is the output the layer would normally produce.
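The selection rule above can be sketched in plain NumPy. This is a minimal illustration, not the Keras layer itself: with independent Bernoulli masks theta1 and theta2, each unit independently lands on one of the four cases 0, x, F(x), or x + F(x):

```python
import numpy as np

rng = np.random.default_rng(0)

def swapout(x, fx, p1=0.5, p2=0.5):
    """Swapout rule: y = theta1 * x + theta2 * F(x), with independent
    per-unit Bernoulli masks theta1 ~ B(p1) and theta2 ~ B(p2)."""
    theta1 = rng.random(x.shape) < p1
    theta2 = rng.random(fx.shape) < p2
    return theta1 * x + theta2 * fx

x = np.full(8, 1.0)    # toy input
fx = np.full(8, 2.0)   # toy "layer output" F(x)
y = swapout(x, fx)
# every unit is one of {0, x, F(x), x + F(x)} = {0.0, 1.0, 2.0, 3.0}
```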

So I used the Dense layer as a basis and edited its call method like this (everything else is identical to the Dense layer):

def call(self, inputs):
        output = K.dot(inputs, self.kernel)
        if self.use_bias:
            output = K.bias_add(output, self.bias, data_format='channels_last')
#--------- edited -----------       
        theta1 = np.random.randint(2, size=(1, 256))
        theta2 = np.random.randint(2, size=(1, 256))
        theta1 = theta1.astype(np.float32)
        theta2 = theta2.astype(np.float32)

        output = tf.add(tf.multiply(theta1, inputs), tf.multiply(theta2, output))
# -----------------------------------
        if self.activation is not None:
            output = self.activation(output)
        return output
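For reference, the shape clash this produces can be reproduced in plain NumPy, which follows the same broadcasting rules TensorFlow applies in tf.multiply: a (1, 256) mask cannot be broadcast against a (32, 784) batch of flattened inputs, because the trailing dimensions (256 vs. 784) neither match nor equal 1:

```python
import numpy as np

inputs = np.ones((32, 784), dtype=np.float32)  # a batch of flattened MNIST images
theta1 = np.random.randint(2, size=(1, 256)).astype(np.float32)

try:
    _ = theta1 * inputs  # trailing dims 256 vs. 784: not broadcastable
    broadcast_ok = True
except ValueError:
    broadcast_ok = False  # this is the failure the traceback below reports
```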

So I simply compute the output as in the paper, Y = theta1 * inputs + theta2 * output, where * is element-wise multiplication. The output shape appears unchanged compared to omitting this computation, but when I try to run it in the following model:

import tensorflow as tf
from keras.models import Sequential
from keras.layers import Dense, Flatten
from swapout import Swapout

mnist = tf.keras.datasets.mnist

(x_train, y_train), (x_test, y_test) = mnist.load_data()
x_train, x_test = x_train / 255.0, x_test / 255.0

model = Sequential()
model.add(Flatten(input_shape = (28,28)))
model.add(Swapout(256, activation='relu'))
model.add(Dense(10, activation='softmax'))

model.compile(optimizer='adam',
                      loss='sparse_categorical_crossentropy',
                      metrics=['accuracy'])

model.fit(x_train, y_train, epochs=5)
res = model.evaluate(x_test, y_test, verbose=2)

I get the following error:

2019-12-21 22:59:31.924209: W tensorflow/core/common_runtime/base_collective_executor.cc:216] BaseCollectiveExecutor::StartAbort Invalid argument: Incompatible shapes: [32,784] vs. [1,256]
     [[{{node swapout_1/Mul}}]]
Traceback (most recent call last):
  File "/home/andreas/studium/semester9/vertiefungs/neural.py", line 294, in <module>
    runSingleMnistExperimant()
  File "/home/andreas/studium/semester9/vertiefungs/neural.py", line 220, in runSingleMnistExperimant
    model.fit(x_train, y_train, epochs=5)
  File "/home/andreas/studium/my_tensorflow/venv/lib/python3.6/site-packages/keras/engine/training.py", line 1239, in fit
    validation_freq=validation_freq)
  File "/home/andreas/studium/my_tensorflow/venv/lib/python3.6/site-packages/keras/engine/training_arrays.py", line 196, in fit_loop
    outs = fit_function(ins_batch)
  File "/home/andreas/studium/my_tensorflow/venv/lib/python3.6/site-packages/tensorflow_core/python/keras/backend.py", line 3740, in __call__
    outputs = self._graph_fn(*converted_inputs)
  File "/home/andreas/studium/my_tensorflow/venv/lib/python3.6/site-packages/tensorflow_core/python/eager/function.py", line 1081, in __call__
    return self._call_impl(args, kwargs)
  File "/home/andreas/studium/my_tensorflow/venv/lib/python3.6/site-packages/tensorflow_core/python/eager/function.py", line 1121, in _call_impl
    return self._call_flat(args, self.captured_inputs, cancellation_manager)
  File "/home/andreas/studium/my_tensorflow/venv/lib/python3.6/site-packages/tensorflow_core/python/eager/function.py", line 1224, in _call_flat
    ctx, args, cancellation_manager=cancellation_manager)
  File "/home/andreas/studium/my_tensorflow/venv/lib/python3.6/site-packages/tensorflow_core/python/eager/function.py", line 511, in call
    ctx=ctx)
  File "/home/andreas/studium/my_tensorflow/venv/lib/python3.6/site-packages/tensorflow_core/python/eager/execute.py", line 67, in quick_execute
    six.raise_from(core._status_to_exception(e.code, message), None)
  File "<string>", line 3, in raise_from
tensorflow.python.framework.errors_impl.InvalidArgumentError:  Incompatible shapes: [32,784] vs. [1,256]
     [[node swapout_1/Mul (defined at /my_tensorflow/venv/lib/python3.6/site-packages/tensorflow_core/python/framework/ops.py:1751) ]] [Op:__inference_keras_scratch_graph_761]

Function call stack:
keras_scratch_graph

What am I doing wrong in the computation? Is there a better/simpler way to implement a swapout layer? I am quite new to Keras and TensorFlow ;)

Thanks a lot for your help.


Update: I just solved the problem. The issue was the shape of the inputs, (None, 784): of course that cannot be element-wise multiplied with a (1, 256) vector. I also had to multiply one of the summands by a matrix so that both have the same shape; I chose zero padding.

Here is the solution:

def call(self, inputs):
    output = K.dot(inputs, self.kernel)
    if self.use_bias:
        output = K.bias_add(output, self.bias, data_format='channels_last')

    inputLength = K.int_shape(inputs)[-1]
    outputLength = K.int_shape(output)[-1]
    theta1 = np.random.randint(2, size=(1,inputLength))
    theta2 = np.random.randint(2, size=(1,outputLength))
    theta1 = theta1.astype(np.float32)
    theta2 = theta2.astype(np.float32)

    zeroPadding = getZeroPaddingMatrix(inputLength, outputLength)

    output = tf.add(tf.matmul(tf.multiply(theta1, inputs), zeroPadding), tf.multiply(theta2, output))

    if self.activation is not None:
        output = self.activation(output)
    return output

The zero-padding matrix is computed as follows:

def getZeroPaddingMatrix(inputLength, outputLength):
    zeroPadding = np.identity(inputLength)
    if(outputLength > inputLength):
        zeros = np.zeros((inputLength, outputLength-inputLength))
        zeroPadding = np.concatenate((zeroPadding, zeros), axis=1)
    elif(outputLength < inputLength):
        zeroPadding =  zeroPadding[:,(inputLength-outputLength) :]
    return zeroPadding.astype(np.float32)
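A quick shape check of this helper (the function is copied verbatim so the snippet is self-contained): multiplying a row vector by the matrix either appends zero columns (when the output is wider) or drops leading columns (when it is narrower):

```python
import numpy as np

def getZeroPaddingMatrix(inputLength, outputLength):
    zeroPadding = np.identity(inputLength)
    if outputLength > inputLength:
        zeros = np.zeros((inputLength, outputLength - inputLength))
        zeroPadding = np.concatenate((zeroPadding, zeros), axis=1)
    elif outputLength < inputLength:
        zeroPadding = zeroPadding[:, (inputLength - outputLength):]
    return zeroPadding.astype(np.float32)

x = np.arange(4, dtype=np.float32).reshape(1, 4)  # a (1, 4) "input"
pad_up = getZeroPaddingMatrix(4, 6)    # widen 4 -> 6 with zero columns
pad_down = getZeroPaddingMatrix(4, 2)  # narrow 4 -> 2 by truncation
# x @ pad_up   -> [[0., 1., 2., 3., 0., 0.]]
# x @ pad_down -> [[2., 3.]]
```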

The only thing left to do is add parameters to control theta1 and theta2.
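One way to do that is a small mask factory with tunable keep-probabilities instead of the fixed fair coin of np.random.randint(2, ...). make_thetas is a hypothetical helper name, not part of the answer's code. Note also that a NumPy call inside call() is evaluated once when the graph is traced, so the same masks would be reused for every batch; to resample per batch, the sampling would have to move into the graph (e.g. via tf.random.uniform or K.random_binomial):

```python
import numpy as np

def make_thetas(length, p1, p2, rng=None):
    """Hypothetical helper: Bernoulli masks of shape (1, length) with
    keep-probabilities p1 and p2 as parameters."""
    rng = rng or np.random.default_rng()
    theta1 = (rng.random((1, length)) < p1).astype(np.float32)
    theta2 = (rng.random((1, length)) < p2).astype(np.float32)
    return theta1, theta2

t1, t2 = make_thetas(256, p1=0.25, p2=0.9)  # keep ~25% of x, ~90% of F(x)
```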

