Maximum likelihood learning for a neural probabilistic language model in Python with Theano

Published 2024-10-03 17:18:40


I am trying to implement maximum likelihood learning for a neural probabilistic language model in Python, starting from this log-bilinear model code: https://github.com/wenjieguan/Log-bilinear-language-models/blob/master/lbl.py
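As I understand it, the model builds a predicted representation of the next word from the feature vectors of its context words, scores every vocabulary word against it with a softmax, and maximum likelihood learning then moves the parameters along the gradient of the log-probability of the observed word:

\[
\hat{q} = \sum_{i=1}^{n-1} C_i\, r_{w_i}, \qquad
P(w_n = w \mid w_1,\dots,w_{n-1}) = \frac{\exp\!\left(r_w^{\top}\hat{q} + b_w\right)}{\sum_{v \in V} \exp\!\left(r_v^{\top}\hat{q} + b_v\right)}, \qquad
\theta \leftarrow \theta + \alpha\,\frac{\partial \log P(w_n \mid w_1,\dots,w_{n-1})}{\partial \theta}
\]

Here the C_i are the per-position context matrices, the r_w are the word feature vectors, and the b_w are the biases; they correspond to contextMatrix, featureVectors and biases in the code below.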

I used Theano's gradient function (T.grad) to compute the gradients and tried to update the model's parameters through a theano.function, but it does not work and I get errors. Here is my code:

def train(self, sentences, alpha = 0.001, batches = 1000):
    print('Start training...')
    self.alpha = alpha
    count = 0

    RARE = self.vocab['<>']
    #print RARE
    q = np.zeros(self.dim, np.float32)
    #print q
    delta_context = [np.zeros((self.dim, self.dim), np.float32) for i in range(self.context) ]
    #print delta_context
    delta_feature = np.zeros((len(self.vocab), self.dim), np.float32)
    #print delta_feature
    for sentence in sentences:
        sentence = self.start_sen + sentence + self.end_sen
        for pos in range(self.context, len(sentence) ):
            count += 1
            q.fill(0)
            featureW = []
            contextMatrix = []
            indices = []
            for i, r in enumerate(sentence[pos - self.context : pos]):
                if r == '<_>':
                    continue
                index = self.vocab.get(r, RARE)
                print(index)
                indices.append(index)
                ri = self.featureVectors[index]
                #print ri
                ci = self.contextMatrix[i]
                #print ci
                featureW.append(ri)
                contextMatrix.append(ci)
                #Calculating predicted representation for the target word
                q += np.dot(ci, ri)
            #Computing energy function
            energy = np.exp(np.dot(self.featureVectors, q) + self.biases)
            #print energy
            #Computing the conditional distribution
            probs = energy / np.sum(energy)
            #print probs
            w_index = self.vocab.get(sentence[pos], RARE)


            #Computing gradient
            logProbs = T.log(probs[w_index])
            print('Gradient start...')
            delta_context, delta_feature = T.grad(logProbs, [self.contextMatrix, self.featureVectors])
            print('Gradient completed!')
            train = theano.function(
                                    inputs = [self.vocab],
                                    outputs = [logProbs],
                                    updates=((self.featureVectors, self.featureVectors - self.alpha * delta_feature),
                                             (self.contextMatrix, self.contextMatrix - self.alpha * delta_context)),
                                    name="train"
                                    )



    print('Training is finished!')

I have only just started learning Python and neural probabilistic language models, so this is quite difficult for me. Could you please help me? Thank you!
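To make clearer what I am aiming for: my understanding is that in Theano the parameters should live in shared variables, the log-probability should be built entirely from symbolic expressions, and theano.function should be compiled once, outside the training loop, so that every call performs one update. Below is a minimal sketch of that pattern; the sizes, the random initialisation and the names R, C, b and train_step are placeholders I made up for illustration, not the ones from the repository.

import numpy as np
import theano
import theano.tensor as T

# Illustrative sizes only
vocab_size, dim, context = 1000, 50, 2
rng = np.random.RandomState(0)

# Parameters as shared variables, so that `updates` can modify them in place
R = theano.shared(rng.normal(0, 0.1, (vocab_size, dim)).astype('float32'), name='R')    # word feature vectors
C = theano.shared(rng.normal(0, 0.1, (context, dim, dim)).astype('float32'), name='C')  # one matrix per context position
b = theano.shared(np.zeros(vocab_size, dtype='float32'), name='b')                      # per-word biases

# Symbolic inputs: indices of the context words and of the target word
context_idx = T.ivector('context_idx')   # shape (context,)
target_idx = T.iscalar('target_idx')

# Predicted representation q = sum_i C_i . r_{w_i}, built symbolically
r_ctx = R[context_idx]                                           # (context, dim)
q = (C * r_ctx.dimshuffle(0, 'x', 1)).sum(axis=2).sum(axis=0)    # (dim,)

# Log-probability of the target word under a softmax over the vocabulary
energy = T.dot(R, q) + b
energy = energy - T.max(energy)                                  # numerical stability
log_prob = energy[target_idx] - T.log(T.sum(T.exp(energy)))

# Symbolic gradients of the log-likelihood w.r.t. the shared parameters
g_R, g_C, g_b = T.grad(log_prob, [R, C, b])

alpha = np.float32(0.001)                                        # keep everything float32
train_step = theano.function(
    inputs=[context_idx, target_idx],
    outputs=log_prob,
    updates=[(R, R + alpha * g_R),   # gradient *ascent*, since we maximise the log-likelihood
             (C, C + alpha * g_C),
             (b, b + alpha * g_b)],
    name='train_step',
)

# Usage: one update per (context window, target word) pair
ll = train_step(np.array([3, 17], dtype='int32'), np.int32(42))

Compiling train_step once and calling it for every position is what avoids rebuilding the graph inside the sentence loop. Is this the right direction?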

