TimeDistributed(Dense) vs Dense, and TimeDistributed(Conv2D)

Published 2024-09-30 14:35:35


I am trying to understand how the TimeDistributed() layer works in Keras. I understand that when we wrap a Conv2D layer in TimeDistributed(), the same Conv2D layer is applied to every temporal slice of the input (i.e. to each of the frames in a video sequence), as described here: https://www.tensorflow.org/api_docs/python/tf/keras/layers/TimeDistributed
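To make that concrete, here is a minimal sketch (the batch size, frame count, and image dimensions are made up for illustration) showing TimeDistributed applying one shared Conv2D to every frame:

```python
import tensorflow as tf

# Hypothetical video batch: 8 clips, 10 frames each, 64x64 RGB.
frames = tf.random.normal((8, 10, 64, 64, 3))

# Wrapping Conv2D in TimeDistributed applies the SAME Conv2D
# (same kernel weights) to each of the 10 frames independently.
td_conv = tf.keras.layers.TimeDistributed(
    tf.keras.layers.Conv2D(16, kernel_size=3, padding='same')
)
out = td_conv(frames)
print(out.shape)  # (8, 10, 64, 64, 16) -- the time axis is preserved
```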

In my project I am trying to build an LSTM model, which looks like this:

import tensorflow as tf

class Lstm_model_1(tf.keras.Model):

    def __init__(self, num_classes):
        super(Lstm_model_1, self).__init__()
        self.Lstm1 = tf.keras.layers.LSTM(32, return_sequences=True)
        self.Lstm2 = tf.keras.layers.LSTM(32, return_sequences=True)
        self.classifier = tf.keras.layers.Dense(num_classes, activation='softmax')
        self.TimeDistributed = tf.keras.layers.TimeDistributed(self.classifier)

    def call(self, inputs):
        x = self.Lstm1(inputs)
        x = self.Lstm2(x)
        output = self.TimeDistributed(x)
        return output

lstm_1 = Lstm_model_1(3)
lstm_1.compile(optimizer='adam', loss=tf.keras.losses.CategoricalCrossentropy())
lstm_1.fit(X_train, Y_train, epochs=3, validation_data=(X_test, Y_test))
lstm_1.summary()
Model: "lstm_model_1_1"
_________________________________________________________________
Layer (type)                 Output Shape              Param #   
=================================================================
lstm_9 (LSTM)                multiple                  55552     
_________________________________________________________________
lstm_10 (LSTM)               multiple                  8320      
_________________________________________________________________
dense_6 (Dense)              multiple                  99        
_________________________________________________________________
time_distributed (TimeDistri multiple                  99        
=================================================================
Total params: 63,971
Trainable params: 63,971
Non-trainable params: 0
_________________________________________________________________

Here the TimeDistributed() layer has 99 parameters.

Now, when I do not use the TimeDistributed() layer (i.e. I apply the Dense layer directly), I get the same number of parameters: 99.
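The parameter count matches because Dense already operates on the last axis only, so its kernel is 32 × 3 plus 3 biases = 99 either way. A small standalone sketch of the comparison (input shape chosen to match the 32-unit LSTM output above):

```python
import tensorflow as tf

# Sequence input with 32 features per timestep, like the LSTM output above.
inp = tf.keras.Input(shape=(None, 32))

dense = tf.keras.layers.Dense(3)
td = tf.keras.layers.TimeDistributed(tf.keras.layers.Dense(3))

# Both produce (batch, timesteps, 3) and both have 32*3 + 3 = 99 parameters.
m1 = tf.keras.Model(inp, dense(inp))
m2 = tf.keras.Model(inp, td(inp))
print(m1.count_params(), m2.count_params())  # 99 99
```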

I read the following in these posts:

If return_sequences=True, then the Dense layer is used to apply at every timestep just like TimeDistributedDense.


As a side note: this makes TimeDistributed(Dense(...)) and Dense(...) equivalent to each other.


Another side note: be aware that this has the effect of shared weights.


  1. TimeDistributed(Dense) vs Dense in seq2seq
  2. Keras Dense layer's input is not flattened

Now, as far as I understand, when a Dense layer is applied on top of an LSTM with return_sequences=True, the weights are shared across all timesteps, which makes sense. But I have the following questions:

  1. Is wrapping Dense() in TimeDistributed() redundant? Can we just use Dense() directly?
  2. If I do not want shared weights across the sequence outputs, what should I do? With return_sequences=True, I want my network to learn a different set of weights for each timestep's output.
  3. If Dense() and TimeDistributed(Dense()) share weights across time anyway, why do we still wrap Dense() in TimeDistributed()? I have seen TimeDistributed() used together with RepeatVector() here: https://datascience.stackexchange.com/questions/46491/what-is-the-job-of-repeatvector-and-timedistributed
  4. Is TimeDistributed() redundant only in the Dense() case, or is the same true for Conv2D layers?
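To illustrate what I mean by unshared weights in question 2, here is one sketch of the behavior I am after (the shapes and the one-Dense-per-timestep approach are just for illustration, not a claim about the idiomatic way to do it):

```python
import tensorflow as tf

timesteps, features, num_classes = 5, 32, 3
seq = tf.random.normal((4, timesteps, features))

# One independent Dense layer per timestep => no weight sharing across time.
per_step = [tf.keras.layers.Dense(num_classes) for _ in range(timesteps)]
outputs = tf.stack(
    [per_step[t](seq[:, t, :]) for t in range(timesteps)], axis=1
)
print(outputs.shape)  # (4, 5, 3)
# Total parameters: timesteps * (features*num_classes + num_classes)
#                 = 5 * 99 = 495, vs 99 for a single shared Dense.
```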
