Implementing Seq2Seq with image sequences in Keras

Posted 2024-05-19 18:46:40

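I have a problem where the input is a sequence of images and the output is a sequence of labels, one for each frame. The input format is shown in the figure below:

Goal: given the input sequence [img 0, img 1, img 2, img 3], predict the label list [label 0, label 1, label 2, label 3]. The trained model should output:

P([label 0, label 1, label 2, label 3] | [img 0, img 1, img 2, img 3]).

label 0 depends on img 0 and is also related to {}. The other labels likewise depend on all images in the input sequence. The target labels therefore depend on both the spatial information within a single image and the temporal information across the sequence.

So I plan to use a convolutional neural network (CNN) to encode the spatial information of each img. How, then, can I use an LSTM to encode the temporal information of the img sequence?

Here is my code: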

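from keras.models import Sequential, Model
# MaxPooling2D is used below and must be imported; BatchNormalization is
# exposed directly from keras.layers in current Keras releases.
from keras.layers import (Dense, Conv2D, MaxPooling2D, LSTM, Flatten,
                          TimeDistributed, RepeatVector, BatchNormalization)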

def cnn_lstm():

    model = Sequential()

    # CNN module
    model.add(TimeDistributed(Conv2D(filters = 8, 
                                    kernel_size = (2, 2), 
                                    padding = 'same',
                                    activation='relu',
                                    name = 'Conv_1'),
                                    input_shape = (None, img_height, img_width, channels)))

    model.add(TimeDistributed(BatchNormalization(name='BN_1')))
    model.add(TimeDistributed(MaxPooling2D()))

    model.add(TimeDistributed(Conv2D(filters = 8, 
                                    kernel_size = (2, 2), 
                                    padding = 'same',
                                    activation='relu',
                                    name = 'Conv_2')))

    model.add(TimeDistributed(BatchNormalization(name='BN_2')))
    model.add(TimeDistributed(MaxPooling2D()))

    # Flatten all CNN features before feeding them into the LSTM
    model.add(TimeDistributed(Flatten()))

    # LSTM module
    model.add(LSTM(50))
    model.add(RepeatVector(output_seq_length))

    model.add(LSTM(50, return_sequences=True, name = 'decoder'))
    model.add(TimeDistributed(Dense(nb_classes, activation='softmax')))

    model.compile(loss='categorical_crossentropy', optimizer='rmsprop', metrics=['accuracy'])

    return model

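In this example, output_seq_length = 4 and {}.

Will my model do what I expect?

If this is a seq2seq problem, it does not appear to involve "teacher forcing" as shown in this tutorial.

Is there a way to encode spatial information with a CNN and temporal information with an LSTM at the same time? Something like a combination of a CNN and an encoder-decoder LSTM?

Any comments are welcome! Thank you very much!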

1 Answer

#1 · Posted 2024-05-19 18:46:40

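Your model will most likely work, but you can improve the LSTMs by wrapping them in Bidirectional so that information is accumulated in both directions. Also, since you are labeling every image, you probably don't want the first LSTM to compress the sequence into a single vector; have it return sequences instead (return_sequences=True).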
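A minimal sketch of that suggestion, under assumptions: the image size, channel count, and number of classes below are hypothetical placeholders (the question does not give them), only one convolution block is kept for brevity, and the RepeatVector/decoder pair is dropped because the first LSTM now emits one vector per frame. This is one possible arrangement, not necessarily the exact model the answer had in mind.

```python
from keras import layers, models

# Hypothetical sizes, chosen only for illustration.
SEQ_LEN, IMG_H, IMG_W, CHANNELS, NB_CLASSES = 4, 32, 32, 3, 5

def cnn_bilstm_tagger():
    # One input sample is a sequence of SEQ_LEN frames.
    frames = layers.Input(shape=(SEQ_LEN, IMG_H, IMG_W, CHANNELS))

    # The same CNN is applied to every frame (spatial encoding).
    x = layers.TimeDistributed(
        layers.Conv2D(8, (2, 2), padding='same', activation='relu'))(frames)
    x = layers.TimeDistributed(layers.MaxPooling2D())(x)
    x = layers.TimeDistributed(layers.Flatten())(x)

    # return_sequences=True keeps one output per frame; Bidirectional lets
    # each frame's output see both earlier and later frames.
    x = layers.Bidirectional(layers.LSTM(50, return_sequences=True))(x)

    # One softmax over the classes for every frame.
    labels = layers.TimeDistributed(
        layers.Dense(NB_CLASSES, activation='softmax'))(x)

    model = models.Model(frames, labels)
    model.compile(loss='categorical_crossentropy', optimizer='rmsprop',
                  metrics=['accuracy'])
    return model

bilstm_model = cnn_bilstm_tagger()
```

With these placeholder sizes the model maps a batch of shape (batch, 4, 32, 32, 3) to per-frame class probabilities of shape (batch, 4, 5), so the targets should be one-hot labels with one row per frame.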

A combination of CNN and LSTM? Keras has a ConvLSTM2D layer, which applies convolutions to both the input and the recurrent components.
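A rough sketch of how ConvLSTM2D could be used for per-frame labeling, again under assumptions: all sizes are hypothetical, and the global average pooling step is just one simple way to turn each frame's feature map into a vector before classification.

```python
from keras import layers, models

# Hypothetical sizes, chosen only for illustration.
SEQ_LEN, IMG_H, IMG_W, CHANNELS, NB_CLASSES = 4, 16, 16, 3, 5

def convlstm_tagger():
    # ConvLSTM2D expects 5D input: (batch, time, rows, cols, channels).
    frames = layers.Input(shape=(SEQ_LEN, IMG_H, IMG_W, CHANNELS))

    # Spatial and temporal encoding happen in the same layer;
    # return_sequences=True keeps one feature map per frame.
    x = layers.ConvLSTM2D(filters=8, kernel_size=(3, 3), padding='same',
                          return_sequences=True)(frames)

    # Collapse each frame's feature map into a vector of 8 features.
    x = layers.TimeDistributed(layers.GlobalAveragePooling2D())(x)

    # One softmax over the classes for every frame.
    labels = layers.TimeDistributed(
        layers.Dense(NB_CLASSES, activation='softmax'))(x)

    model = models.Model(frames, labels)
    model.compile(loss='categorical_crossentropy', optimizer='rmsprop',
                  metrics=['accuracy'])
    return model

convlstm_model = convlstm_tagger()
```

Compared with a TimeDistributed CNN followed by an LSTM, this variant keeps the spatial layout inside the recurrent state, which may help when motion between frames matters; whether it performs better on a given dataset would need to be checked empirically.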
