A `for` loop over a multi-dimensional array in PyTorch

Posted 2024-06-03 00:12:27


I want to implement a question-answering system with an attention mechanism. I have two inputs, `context` and `query`, with shapes (batch_size, context_seq_len, embd_size) and (batch_size, query_seq_len, embd_size).
I am following this paper: Machine Comprehension Using Match-LSTM and Answer Pointer. https://arxiv.org/abs/1608.07905

From it I obtain an attention matrix of shape (batch_size, context_seq_len, query_seq_len, embd_size). In the paper, the values are computed row by row, one row per context word (G_i and alpha_i in the paper's notation).

My code is below, and it runs. But I am not sure whether my approach is good. For example, I use a `for` loop to generate the sequence data (`for i in range(T):`), and to get each row I use slicing like `G[:,i,:,:]` and `embd_context[:,i,:].clone()`. Is this a good way to do it in PyTorch? If not, where should I change the code?
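To make the comparison concrete, here is a minimal standalone sketch (toy shapes, random data, not my actual module) of the two styles I am weighing: writing into slices of a pre-allocated tensor versus collecting each row in a list and stacking once at the end:

```python
import torch

bs, T, J, d = 2, 3, 4, 5  # toy batch size, context length, query length, embed dim

# Style A: pre-allocate G and assign into slices inside the loop
G = torch.zeros(bs, T, J, d)
for i in range(T):
    G[:, i, :, :] = torch.randn(bs, J, d)  # per-timestep result written in place

# Style B: collect per-timestep results, then stack once along dim=1
rows = [torch.randn(bs, J, d) for _ in range(T)]
G2 = torch.stack(rows, dim=1)  # (bs, T, J, d)
```

Both produce a tensor of the same shape; style B avoids in-place writes entirely.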

If you notice any other problems, please let me know. I am new to this field, so sorry if my question is ambiguous.

class MatchLSTM(nn.Module):
    def __init__(self, args):
        super(MatchLSTM, self).__init__()
        self.embd_size = args.embd_size
        d = self.embd_size
        self.answer_token_len = args.answer_token_len

        self.embd = WordEmbedding(args)
        self.ctx_rnn   = nn.GRU(d, d, dropout = 0.2)
        self.query_rnn = nn.GRU(d, d, dropout = 0.2)

        self.ptr_net = PointerNetwork(d, d, self.answer_token_len) # TBD

        self.w  = nn.Parameter(torch.rand(1, d, 1).type(torch.FloatTensor), requires_grad=True) # (1, d, 1)
        self.Wq = nn.Parameter(torch.rand(1, d, d).type(torch.FloatTensor), requires_grad=True) # (1, d, d)
        self.Wp = nn.Parameter(torch.rand(1, d, d).type(torch.FloatTensor), requires_grad=True) # (1, d, d)
        self.Wr = nn.Parameter(torch.rand(1, d, d).type(torch.FloatTensor), requires_grad=True) # (1, d, d)

        self.match_lstm_cell = nn.LSTMCell(2*d, d)

    def forward(self, context, query):
        # params
        d = self.embd_size
        bs = context.size(0) # batch size
        T = context.size(1)  # context length 
        J = query.size(1)    # query length

        # LSTM Preprocessing Layer
        shape = (bs, T, J, d)
        embd_context     = self.embd(context)         # (N, T, d)
        embd_context, _h = self.ctx_rnn(embd_context) # (N, T, d)
        embd_context_ex  = embd_context.unsqueeze(2).expand(shape).contiguous() # (N, T, J, d)
        embd_query       = self.embd(query)           # (N, J, d)
        embd_query, _h   = self.query_rnn(embd_query) # (N, J, d)
        embd_query_ex  = embd_query.unsqueeze(1).expand(shape).contiguous() # (N, T, J, d)

        # Match-LSTM layer
        G = to_var(torch.zeros(bs, T, J, d)) # (N, T, J, d)

        wh_q = torch.bmm(embd_query, self.Wq.expand(bs, d, d)) # (N, J, d) = (N, J, d)(N, d, d)

        hidden     = to_var(torch.randn([bs, d])) # (N, d)
        cell_state = to_var(torch.randn([bs, d])) # (N, d)
        # TODO bidirectional
        H_r = [hidden]
        for i in range(T):
            wh_p_i = torch.bmm(embd_context[:,i,:].clone().unsqueeze(1), self.Wp.expand(bs, d, d)).squeeze() # (N, 1, d) -> (N, d)
            wh_r_i = torch.bmm(hidden.unsqueeze(1), self.Wr.expand(bs, d, d)).squeeze() # (N, 1, d) -> (N, d)
            sec_elm = (wh_p_i + wh_r_i).unsqueeze(1).expand(bs, J, d) # (N, J, d)

            G[:,i,:,:] = F.tanh( (wh_q + sec_elm).view(-1, d) ).view(bs, J, d) # (N, J, d) # TODO bias

            attn_i = torch.bmm(G[:,i,:,:].clone(), self.w.expand(bs, d, 1)).squeeze() # (N, J)
            attn_query = torch.bmm(attn_i.unsqueeze(1), embd_query).squeeze() # (N, d) 
            z = torch.cat((embd_context[:,i,:], attn_query), 1) # (N, 2d)

            hidden, cell_state = self.match_lstm_cell(z, (hidden, cell_state)) # (N, d), (N, d)
            H_r.append(hidden)
        H_r = torch.stack(H_r, dim=1) # (N, T, d)

        indices = self.ptr_net(H_r) # (N, M, T) , M means (start, end)
        return indices

1 Answer

Forum user · #1

I think your code is fine. You cannot avoid the loop `for i in range(T):`, because in equation (2) of the paper (https://openreview.net/pdf?id=B1-q5Pqxl) there is a hidden state coming from the Match-LSTM cell that is involved in computing the G_i and alpha_i vectors, and these are in turn used to compute the input for the Match-LSTM at the next time step. So you need to run the loop for every timestep of the Match-LSTM, and I see no way to avoid the for loop.
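That said, if the in-place writes into the pre-allocated `G` (and the `.clone()` calls they force) bother you, one option is to collect each G_i in a Python list and call `torch.stack` after the loop; the time loop itself stays. Below is a rough, self-contained sketch with toy shapes and plain tensors instead of your module's parameters; I also apply a softmax to the attention weights, as the paper does:

```python
import torch
import torch.nn as nn

bs, T, J, d = 2, 3, 4, 8  # toy batch size, context length, query length, embed dim
embd_context = torch.randn(bs, T, d)
embd_query   = torch.randn(bs, J, d)
wh_q = torch.randn(bs, J, d)   # stands in for W^q * H^q, precomputed outside the loop
Wp   = torch.randn(d, d)
w    = torch.randn(d, 1)
cell = nn.LSTMCell(2 * d, d)

hidden     = torch.zeros(bs, d)
cell_state = torch.zeros(bs, d)
G_rows, H_r = [], [hidden]
for i in range(T):
    wh_p_i = embd_context[:, i, :] @ Wp                    # (bs, d)
    G_i = torch.tanh(wh_q + wh_p_i.unsqueeze(1))           # (bs, J, d), no slice writes
    G_rows.append(G_i)
    attn_i = torch.softmax((G_i @ w).squeeze(-1), dim=1)   # (bs, J), normalized weights
    attn_query = torch.bmm(attn_i.unsqueeze(1), embd_query).squeeze(1)  # (bs, d)
    z = torch.cat((embd_context[:, i, :], attn_query), dim=1)           # (bs, 2d)
    hidden, cell_state = cell(z, (hidden, cell_state))
    H_r.append(hidden)

G = torch.stack(G_rows, dim=1)  # (bs, T, J, d), built without in-place assignment
H = torch.stack(H_r, dim=1)     # (bs, T+1, d)
```

Functionally this matches the slice-assignment version, but autograd never has to track writes into a shared buffer, so no `.clone()` is needed.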
