I have a simple neural network that takes 3 inputs and produces 1 output. I have a backpropagation algorithm that I have been using to train the network. It seems to work for 1-layer and 2-layer networks, but with a 3-layer network I always get an output of 0.5, no matter what the input is.
I derived the algorithm myself from Michael Nielsen's book and James Loy's blog post. It works for 1 and 2 layers, but I can't find anything online about why it doesn't work for more layers.
def backprop(self):
    # application of the chain rule to find the derivative of the loss function with respect to the weights
    output_error = 2 * (self.y - self.output) * sigmoid_derivative(self.output)
    d_weights3 = np.dot(self.layer2.T, output_error)

    previous_error = output_error
    error_new = np.dot(previous_error, self.weights3.T) * sigmoid_derivative(self.layer2)
    d_weights2 = np.dot(self.layer1.T, error_new)

    previous_error = error_new
    d_weights1 = np.dot(self.input.T, (np.dot(previous_error, self.weights2.T) * sigmoid_derivative(self.layer1)))

    # update the weights with the derivative (slope) of the loss function
    self.weights1 += d_weights1
    self.weights2 += d_weights2
    self.weights3 += d_weights3
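For context, here is a minimal, self-contained sketch of the kind of 3-layer sigmoid network this `backprop` belongs to. The helper definitions are my assumptions, not code from the question: `sigmoid_derivative` is assumed to take the already-activated output (`a * (1 - a)`), `layer1`/`layer2` are assumed to be post-activation values from a forward pass, and the class name `Net3`, hidden width, learning rate, and XOR-style training data are all invented for illustration.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def sigmoid_derivative(a):
    # derivative of sigmoid written in terms of its output a = sigmoid(x)
    return a * (1.0 - a)

class Net3:
    def __init__(self, x, y, hidden=4, seed=0):
        rng = np.random.default_rng(seed)
        self.input = x
        self.y = y
        # three weight matrices -> three layers of computation
        self.weights1 = rng.standard_normal((x.shape[1], hidden))
        self.weights2 = rng.standard_normal((hidden, hidden))
        self.weights3 = rng.standard_normal((hidden, 1))

    def feedforward(self):
        self.layer1 = sigmoid(self.input @ self.weights1)
        self.layer2 = sigmoid(self.layer1 @ self.weights2)
        self.output = sigmoid(self.layer2 @ self.weights3)

    def backprop(self, lr=0.5):
        # same chain-rule structure as the code in the question,
        # with an explicit learning rate added
        output_error = 2 * (self.y - self.output) * sigmoid_derivative(self.output)
        d_weights3 = self.layer2.T @ output_error
        error2 = (output_error @ self.weights3.T) * sigmoid_derivative(self.layer2)
        d_weights2 = self.layer1.T @ error2
        error1 = (error2 @ self.weights2.T) * sigmoid_derivative(self.layer1)
        d_weights1 = self.input.T @ error1
        self.weights1 += lr * d_weights1
        self.weights2 += lr * d_weights2
        self.weights3 += lr * d_weights3

# toy XOR-style data: 3 inputs, 1 output
x = np.array([[0, 0, 1], [0, 1, 1], [1, 0, 1], [1, 1, 1]], dtype=float)
y = np.array([[0], [1], [1], [0]], dtype=float)

net = Net3(x, y)
for _ in range(5000):
    net.feedforward()
    net.backprop()
print(net.output)
```

This is only a reproduction harness under the stated assumptions, not a fix; running something like it makes it easy to inspect whether the gradients or the activations are the part that collapses toward 0.5.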