I am trying to understand how the TimeDistributed() layer works in Keras. I know that when we wrap a Conv2D layer in TimeDistributed(), the same Conv2D layer (with the same weights) is applied to every time step of a video, i.e. to each frame in the video sequence, as described here: https://www.tensorflow.org/api_docs/python/tf/keras/layers/TimeDistributed
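A minimal sketch of that idea (the shapes below are made up for illustration): one Conv2D layer wrapped in TimeDistributed() produces per-frame feature maps while reusing the same kernel for every frame.

```python
import tensorflow as tf

# Fake video batch: (batch, frames, height, width, channels).
# 2 videos, 5 frames each — shapes chosen only for illustration.
video = tf.random.normal((2, 5, 16, 16, 3))

conv = tf.keras.layers.Conv2D(8, kernel_size=3, padding='same')
td_conv = tf.keras.layers.TimeDistributed(conv)

# The same convolution (same weights) is applied to all 5 frames.
out = td_conv(video)
print(out.shape)  # (2, 5, 16, 16, 8)
```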
In my project I am trying to build an LSTM model, which looks like this:
class Lstm_model_1(tf.keras.Model):
    def __init__(self, num_classes):
        super(Lstm_model_1, self).__init__()
        self.Lstm1 = tf.keras.layers.LSTM(32, return_sequences=True)
        self.Lstm2 = tf.keras.layers.LSTM(32, return_sequences=True)
        self.classifier = tf.keras.layers.Dense(num_classes, activation='softmax')
        self.TimeDistributed = tf.keras.layers.TimeDistributed(self.classifier)

    def call(self, inputs):
        input_A = inputs
        x = self.Lstm1(input_A)
        x = self.Lstm2(x)
        output = self.TimeDistributed(x)
        return output

lstm_1 = Lstm_model_1(3)
lstm_1.compile(optimizer='adam', loss=tf.keras.losses.CategoricalCrossentropy())
lstm_1.fit(X_train, Y_train, epochs=3, validation_data=(X_test, Y_test))
lstm_1.summary()
Model: "lstm_model_1_1"
_________________________________________________________________
Layer (type) Output Shape Param #
=================================================================
lstm_9 (LSTM) multiple 55552
_________________________________________________________________
lstm_10 (LSTM) multiple 8320
_________________________________________________________________
dense_6 (Dense) multiple 99
_________________________________________________________________
time_distributed (TimeDistri multiple 99
=================================================================
Total params: 63,971
Trainable params: 63,971
Non-trainable params: 0
_________________________________________________________________
Here I get 99 parameters in the TimeDistributed() layer.

Now, when I do not use the TimeDistributed() layer at all, I get the same number of parameters, i.e. 99.
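This observation can be reproduced in isolation (the input shape below is assumed for illustration): on a 3-D input of shape (batch, timesteps, features), a plain Dense layer and a TimeDistributed(Dense) layer build the same kernel and so report the same parameter count.

```python
import tensorflow as tf

# 3-D input: (batch, timesteps=10, features=32) — shapes assumed for the demo.
inp = tf.keras.Input(shape=(10, 32))

dense = tf.keras.layers.Dense(3)
td = tf.keras.layers.TimeDistributed(tf.keras.layers.Dense(3))

out_dense = dense(inp)  # Dense broadcasts over the timestep axis
out_td = td(inp)

# Both own a single (32, 3) kernel plus a (3,) bias: 32*3 + 3 = 99 params.
print(dense.count_params(), td.count_params())  # 99 99
```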
I read the following in a post:

If return_sequences=True, then the Dense layer is used to apply at every timestep just like TimeDistributedDense.

and

As a side note: this makes TimeDistributed(Dense(...)) and Dense(...) equivalent to each other.
Another side note: be aware that this has the effect of shared weights.
Now, as far as I understand, when a Dense layer is applied on top of an LSTM with return_sequences=True, the weights should be the same across all timesteps, which makes sense. But I have the following questions:

1. Is wrapping Dense() in TimeDistributed() redundant? Can we use Dense() directly instead?
2. With return_sequences=True, is there any way to learn a different set of weights corresponding to each timestep's output?
3. If both the Dense() layer and the TimeDistributed() layer share weights across the time series, why do we still wrap the Dense() layer in TimeDistributed()? I have seen TimeDistributed() used together with a RepeatVector() layer here: https://datascience.stackexchange.com/questions/46491/what-is-the-job-of-repeatvector-and-timedistributed
4. Is TimeDistributed() redundant only in the case of Dense(), or is the same true for a Conv2D layer as well?
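To make the "shared weights" point from the quoted answer concrete, here is a small sketch (shapes assumed for illustration): since Dense applies one and the same kernel at every timestep, identical timestep inputs must produce identical outputs.

```python
import tensorflow as tf

# batch=1, 4 identical timesteps, 8 features — shapes chosen for the demo.
x = tf.ones((1, 4, 8))

dense = tf.keras.layers.Dense(3, activation='softmax')
y = dense(x)  # Dense broadcasts over the timestep axis: output (1, 4, 3)

# All 4 timesteps were transformed by the same weights, so the rows match.
print(y.shape)                                      # (1, 4, 3)
print(bool(tf.reduce_all(y[0, 0] == y[0, 1])))      # True
```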