Attention mechanism for answer representation in a Keras layer

Posted 2024-09-26 18:13:46


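Following the paper "A Stacked BiLSTM Neural Network Based on Coattention Mechanism for Question Answering" (https://www.hindawi.com/journals/cin/2019/9543490/), I am trying to write a custom Keras layer that implements the paper's attention equations.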
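The equations themselves did not survive in this copy of the post. Judging only from the code and comments further down (not from the paper itself), the intended steps appear to be roughly the following, where t indexes the time steps of the answer; the exact notation, the normalization of S, and the final sum over t are assumptions taken from the code:

```latex
\begin{aligned}
O_Q &= \operatorname{maxpool}(C_Q) \\
M_{aq}(t) &= \tanh\!\left(W_{am}\, C_A(t) + W_{qm}\, O_Q\right) \\
S_{aq}(t) &\propto \exp\!\left(W_{ms}^{\top} M_{aq}(t)\right) \\
O_A &= \sum_{t} C_A(t)\, S_{aq}(t)
\end{aligned}
```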

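The context vectors for my question and answer have the following shapes:

Shape of the question context vector (CQ): (512, 256)

Shape of the answer context vector (CA): (512, 256)

Expected output: a cosine score, given the two context vectors above as input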
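For reference, here is a minimal NumPy sketch of a cosine score between two pooled vectors, rescaled to [0, 1] in the same way as the last step of the code in this question. The helper name `cosine_score` and the random 256-dimensional vectors are illustrative assumptions, not part of the original post.

```python
import numpy as np


def cosine_score(u, v, eps=1e-12):
    """Cosine similarity of two 1-D vectors, rescaled from [-1, 1] to [0, 1]."""
    # eps guards against division by zero for all-zero vectors.
    cos = np.dot(u, v) / (np.linalg.norm(u) * np.linalg.norm(v) + eps)
    return 0.5 * cos + 0.5


rng = np.random.default_rng(0)
OQ = rng.normal(size=256)  # stand-in for the pooled question vector
OA = rng.normal(size=256)  # stand-in for the attended answer vector

print(cosine_score(OQ, OA))   # some value between 0 and 1
print(cosine_score(OQ, OQ))   # identical vectors: close to 1
print(cosine_score(OQ, -OQ))  # opposite vectors: close to 0
```

The score is a single scalar per pair of vectors, so for a batch of pairs the same formula would be applied along the feature axis.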

Here is my code:

class score(tf.keras.layers.Layer):

    def __init__(self, units):
        super(score, self).__init__()
        # Initializing the weights
        self.Wam = tf.keras.layers.Dense(units) # Attention matrix of CA at time t.
        self.Wqm = tf.keras.layers.Dense(units) # Attention matrix of OQ
        self.Wms = tf.keras.layers.Dense(units) # Attention weight vector

    def call (self, CQ, CA):

        OQ = tf.keras.layers.GlobalMaxPool1D()(CQ)
        print(OQ.shape)
        CA = tf.expand_dims(CA,1)
        #Maq = tf.nn.tanh(tf.matmul(self.Wam, CA) + tf.matmul(self.Wqm, OQ))
        Maq = tf.nn.tanh(self.Wam(CA) + self.Wqm(OQ))
        Saq  = tf.exp(tf.linalg.matmul(Wms, Maq, transpose_a = True, transpose_b = False))
        OA = CA * Saq
        OA = tf.reduce_sum(OA, axis = 1)

        # Calculate cosine similarity of vectors OQ and OA
        dot_product = tf.tensordot(OQ, OA, axes=0)
        # Normalize input
        norm_OQ = tf.nn.l2_normalize(OQ)
        norm_OA = tf.nn.l2_normalize(OA)
        scoreCosine = dot_product / (norm_OQ * norm_OA)

        #Normalize the cosine similarity to the [0, 1] interval 
        scoreCosine = 0.5 * (scoreCosine) + 0.5

        return scoreCosine

score_layer = score(10)
scoreCosine = score_layer(CQ, CA)

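I run into matrix shape problems at every step. I tried breaking the steps apart and running each one separately.

The problems I am facing:

1) My input matrices are 2D, but the GlobalMaxPool1D layer (first equation) expects a 3D input and produces a 2D output.

2) I initialized the attention matrices (Wam, Wqm) and the attention weight vector (Wms) as Dense layers with 10 units, following a Bahdanau attention layer example. Is that right? It does not work for equation 3 (the transpose of Wms), because Wms is a Dense layer rather than a tensor.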
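Both points can be checked in isolation. The sketch below assumes that 512 is the number of time steps and 256 the feature size of a single unbatched example (the post does not say which axis is which), and it uses random tensors as stand-ins. It shows two ways to pool a 2D matrix, and that a bias-free `Dense(1)` applied to `Maq` gives the same result as multiplying `Maq` by the layer's kernel, which is the per-step product with the weight vector.

```python
import tensorflow as tf

# Hypothetical stand-in for the question context matrix: 512 steps, 256 features.
CQ = tf.random.normal((512, 256))

# Point 1: GlobalMaxPool1D expects (batch, steps, features), so add a batch axis...
OQ_batched = tf.keras.layers.GlobalMaxPool1D()(tf.expand_dims(CQ, axis=0))  # (1, 256)
# ...or, for one unbatched matrix, take the max over the step axis directly.
OQ = tf.reduce_max(CQ, axis=0)  # (256,)

# Point 2: a bias-free Dense(1) owns a kernel of shape (units, 1). Applying the
# layer to Maq multiplies every row of Maq by that kernel, so the layer itself
# never has to be transposed.
units = 10
Maq = tf.random.normal((512, units))  # stand-in for the tanh output
Wms = tf.keras.layers.Dense(1, use_bias=False)
via_layer = Wms(Maq)  # (512, 1)
kernel = tf.convert_to_tensor(Wms.get_weights()[0])  # (units, 1)
via_matmul = tf.matmul(Maq, kernel)  # (512, 1)

print(OQ_batched.shape, OQ.shape, via_layer.shape, via_matmul.shape)
```

If an explicit weight tensor is preferred over a Dense layer, creating it with the layer's `add_weight` method is the usual alternative.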

Please let me know whether this is the right approach, or where I am going wrong. Any help is much appreciated.
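One possible way to make the shapes line up is sketched below. It is not the paper's reference implementation. The class name `CoAttentionScore` is hypothetical. The sketch also assumes a leading batch axis on both inputs, bias-free Dense layers, and a softmax over the answer's time steps in place of the unnormalized `tf.exp`.

```python
import tensorflow as tf


class CoAttentionScore(tf.keras.layers.Layer):
    """Sketch of the scoring layer; names, shapes and softmax are assumptions."""

    def __init__(self, units):
        super().__init__()
        self.Wam = tf.keras.layers.Dense(units, use_bias=False)  # acts on CA
        self.Wqm = tf.keras.layers.Dense(units, use_bias=False)  # acts on OQ
        self.Wms = tf.keras.layers.Dense(1, use_bias=False)      # weight vector

    def call(self, CQ, CA):
        # CQ, CA: (batch, steps, features)
        OQ = tf.reduce_max(CQ, axis=1)                      # (batch, features)
        q_term = tf.expand_dims(self.Wqm(OQ), axis=1)       # (batch, 1, units)
        Maq = tf.tanh(self.Wam(CA) + q_term)                # (batch, steps, units)
        Saq = tf.nn.softmax(self.Wms(Maq), axis=1)          # (batch, steps, 1)
        OA = tf.reduce_sum(CA * Saq, axis=1)                # (batch, features)

        # Cosine similarity: dot product of the two unit-length vectors.
        cos = tf.reduce_sum(
            tf.nn.l2_normalize(OQ, axis=-1) * tf.nn.l2_normalize(OA, axis=-1),
            axis=-1,
        )                                                   # (batch,)
        # Rescale from [-1, 1] to [0, 1].
        return 0.5 * cos + 0.5


# Random stand-ins for one question/answer pair with a batch axis added.
CQ = tf.random.normal((1, 512, 256))
CA = tf.random.normal((1, 512, 256))

score_layer = CoAttentionScore(10)
scoreCosine = score_layer(CQ, CA)
print(scoreCosine.shape)  # (1,)
```

Two details differ from the code in the question. `tf.tensordot(OQ, OA, axes=0)` forms an outer product rather than a dot product, and `tf.nn.l2_normalize` returns the unit-length vector rather than its norm, so the sketch multiplies the two normalized vectors and sums over the feature axis.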
