Applying a shared embedding layer to a set of documents in Keras

Q = Input(shape=(5, ))  # each query is made of 5 words
T = Input(shape=(50, 50))  # each search result is made of 50 words and 50 docs

emb = Embedding(
    max_val,
    embedding_dims,
    dropout=embedding_dropout
)

left = emb(Q)
left = Convolution1D(nb_filter=5,
                     filter_length=5,
                     border_mode='valid',
                     activation='relu',
                     subsample_length=1)(left)
left = GlobalMaxPooling1D()(left)
print(left)

right = emb(T)  # <-- this is my problem, I don't really know what to do/apply here

def merger(vests):
    x, y = vests
    x = K.l2_normalize(x, axis=0)  # Normalize rows
    y = K.l2_normalize(y, axis=-1)  # Normalize the vector
    return tf.matmul(x, y)  # obviously throws an error because of mismatching matrix ranks

def cos_dist_output_shape(shapes):
    shape1, shape2 = shapes
    return (50, 1)

merger_f = Lambda(merger)
predictions = merge([left, right], output_shape=cos_dist_output_shape, mode=merger_f)
model = Model(input=[Q, T], output=predictions)

def custom_objective(y_true, y_pred):
    ordered_output = tf.cast(tf.nn.top_k(y_pred)[1], tf.float32)  # returns the indices of the top values
    return K.mean(K.square(ordered_output - y_true), axis=-1)

model.compile(optimizer='adam', loss=custom_objective)

1 answer

User

#1 · Posted 2024-09-26 22:50:47

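OK. If I understand the situation correctly, you have 50 text snippets, each 50 words long, that you want to embed.

After the word embedding, the embedded T is a tensor of shape (50, 50, emb_size), not counting the batch dimension. What I would do is wrap an LSTM layer in a TimeDistributed wrapper. Add this line after emb(T):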

right = TimeDistributed(LSTM(5))(right)

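This applies the same LSTM to each of the 50 documents and outputs its final state, a vector of length 5, once each document has been processed. After this step, right has shape (50, 5): each document has been embedded into a vector of length 5. The advantage of TimeDistributed is that the LSTM applied to each document shares the same weights, so all documents are processed in the same way. The Keras documentation covers both LSTM and TimeDistributed.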
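As a rough sketch of how this could fit into the question's model, here is the shared embedding plus the TimeDistributed LSTM written against a recent tf.keras API. The question itself uses older Keras 1 argument names, so the Conv1D arguments below are my assumed modern equivalents, and the values of max_val and embedding_dims are placeholders that do not come from the original post. The merge step and the custom loss are left out.

```python
from tensorflow.keras import layers

# Placeholder values, NOT from the original post.
max_val = 1000        # assumed vocabulary size
embedding_dims = 16   # assumed embedding width

Q = layers.Input(shape=(5,), dtype="int32")      # one query of 5 word ids
T = layers.Input(shape=(50, 50), dtype="int32")  # 50 documents of 50 word ids each

# A single Embedding instance, called on both inputs, so its weights are shared.
emb = layers.Embedding(max_val, embedding_dims)

# Query branch: (batch, 5) -> (batch, 5, 16) -> (batch, 1, 5) -> (batch, 5)
left = emb(Q)
left = layers.Conv1D(filters=5, kernel_size=5, padding="valid",
                     activation="relu")(left)
left = layers.GlobalMaxPooling1D()(left)

# Document branch: (batch, 50, 50) -> (batch, 50, 50, 16) -> (batch, 50, 5)
# TimeDistributed runs the same LSTM (same weights) over each of the 50 documents.
right = emb(T)
right = layers.TimeDistributed(layers.LSTM(5))(right)

print(tuple(left.shape), tuple(right.shape))
```

Both branches end in vectors of length 5, one for the query and one per document, which is what a per-document similarity such as the cosine distance in the question would need.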

I hope this helps.
