Decoder predictions do not depend on the encoder input

Posted 2024-10-02 20:37:29

I am training a Seq2Seq model on OpenSubtitles dialogs (Cornell-Movie-Dialogs-Corpus).

My work is based on the following papers:

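Training converges well, but when I inspect what the network actually predicts, I find that the encoder input has no effect on the decoder's predictions. The only thing that matters is the word fed to the decoder.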

For example, after the token <s>, the decoder always predicts the word "i", regardless of the encoder input.
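
One way to quantify this observation is to hold the decoder input fixed, vary only the encoder input, and compare the predicted next-word distributions. The sketch below is framework-independent plain Python. `predict_fn` is a hypothetical wrapper around the trained model (it is not part of the original code) and is assumed to return a list of next-word probabilities; the two toy functions only illustrate the extremes.

```python
from itertools import combinations


def total_variation(p, q):
    """Total variation distance between two discrete distributions."""
    return 0.5 * sum(abs(a - b) for a, b in zip(p, q))


def encoder_sensitivity(predict_fn, sources, decoder_prefix):
    """Mean pairwise distance between next-word distributions.

    predict_fn(source, decoder_prefix) is a hypothetical wrapper around the
    trained model that returns a list of next-word probabilities.
    A result near 0.0 means the encoder input is effectively ignored.
    """
    dists = [predict_fn(src, decoder_prefix) for src in sources]
    pairs = list(combinations(dists, 2))
    if not pairs:
        raise ValueError("need at least two source sequences")
    return sum(total_variation(p, q) for p, q in pairs) / len(pairs)


# Toy stand-ins for a model, only to show the two extremes.
def ignores_encoder(source, prefix):
    # Same distribution whatever the source sentence is.
    return [0.7, 0.2, 0.1]


def uses_encoder(source, prefix):
    # Prediction flips with the parity of the source length.
    return [1.0, 0.0, 0.0] if len(source) % 2 == 0 else [0.0, 1.0, 0.0]
```

Running `encoder_sensitivity` with the real model on a handful of different source sentences and the prefix `["<s>"]` would show whether the score stays at zero, which would match the behaviour described here.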

Pseudocode for my architecture is shown below (I am using Tensorflow 1.5):

seq1 = tf.placeholder(...)
seq2 = tf.placeholder(...)

embeddings = tf.Variable(tf.zeros([vocab_size, 300]))

seq1_emb = tf.nn.embedding_lookup(embeddings, seq1)
seq2_emb = tf.nn.embedding_lookup(embeddings, seq2)

# Encoder init_state is drawn from random_uniform(-0.08, 0.08), following paper [1]

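encoder_out, state1 = tf.nn.dynamic_rnn(BasicLSTMCell(), seq1_emb)
# The decoder starts from the encoder's final state
decoder_out, state2 = tf.nn.dynamic_rnn(BasicLSTMCell(), seq2_emb,
                                        initial_state=state1)
# Project decoder outputs to vocabulary logits (no bias)
logit = tf.layers.dense(decoder_out, vocab_size, use_bias=False)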

crossent = tf.nn.sparse_softmax_cross_entropy_with_logits(logits=logit,
                                                          labels=target)
crossent = mask_padded_zeros(crossent)
loss = tf.reduce_sum(crossent) / number_of_words_in_batch

# Gradients are clipped by norm at 5 (according to paper [1])

train = tf.train.GradientDescentOptimizer(learning_rate=0.7).minimize(loss)
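
The loss normalization and the gradient clipping in the pseudocode can be checked by hand on small numbers. The following is a minimal plain-Python sketch of those two steps, not the TensorFlow implementation. The helper names `masked_mean_loss` and `clip_by_global_norm` are illustrative, and the sketch assumes that padding positions are marked with 0 in a mask and that clipping is applied to the joint norm of all gradients (the question does not say whether the clipping is per tensor or global).

```python
import math


def masked_mean_loss(token_losses, mask):
    """Sum per-token losses over real words and divide by the word count.

    token_losses and mask are lists of rows (one row per sequence);
    mask is 1 for a real word and 0 for padding.
    """
    total = 0.0
    n_words = 0
    for loss_row, mask_row in zip(token_losses, mask):
        for loss, keep in zip(loss_row, mask_row):
            total += loss * keep
            n_words += keep
    if n_words == 0:
        raise ValueError("batch contains no real words")
    return total / n_words


def clip_by_global_norm(grads, clip_norm=5.0):
    """Rescale all gradients so their joint L2 norm is at most clip_norm."""
    global_norm = math.sqrt(sum(g * g for grad in grads for g in grad))
    if global_norm <= clip_norm:
        return [list(grad) for grad in grads], global_norm
    scale = clip_norm / global_norm
    return [[g * scale for g in grad] for grad in grads], global_norm


# Two sequences padded to length 3; the second has one padding position.
losses = [[1.0, 2.0, 3.0], [4.0, 5.0, 9.0]]
mask = [[1, 1, 1], [1, 1, 0]]
batch_loss = masked_mean_loss(losses, mask)  # (1 + 2 + 3 + 4 + 5) / 5

# Joint norm is 10, so everything is scaled by 5 / 10.
clipped, norm_before = clip_by_global_norm([[6.0, 8.0], [0.0]])
```

In TensorFlow 1.x the corresponding calls would likely be `tf.sequence_mask` for building the mask and `tf.clip_by_global_norm` applied between the optimizer's `compute_gradients` and `apply_gradients`, since `minimize(loss)` on its own does not clip.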

I would appreciate any help!


0 answers

No answers yet
