<p>I have read this article <a href="https://towardsdatascience.com/emulating-logical-gates-with-a-neural-network-75c229ec4cc9" rel="nofollow noreferrer">https://towardsdatascience.com/emulating-logical-gates-with-a-neural-network-75c229ec4cc9</a>, and it is also said there that getting good training results requires a deeper network (one with (multiple) hidden layers). The reasons given are:</p>
<blockquote>
<p><strong>Training and Learning</strong></p>
<p><strong>Now we have shown that this neural network is possible, the
remaining question is: is it possible to train? Can we expect that if
we simply fed in the data drawn from the graph above after defining
the layers, number of neurons and activation functions correctly, the
network will train in this way?</strong></p>
<p><strong>No, not always, and not even often. The problem, like with many neural
networks is one of optimization. In training this network it will
often get stuck in a local minimum even though a near-perfect solution
exists. This is where your optimization algorithm may play a large
role, and this is something which Tensorflow Playground doesn’t allow
you to change and may be the subject of a future post.</strong></p>
<p>[...]</p>
<p><strong>After you have built this network by manually inputting the weights,
why not try to train the weights of this network from scratch
instead of constructing it manually. I have managed to do this
after many trials, but I believe it is quite sensitive to the seeding
and often ends up in local minimums. If you find a reliable way to
train this network using these features and this network structure
please reach out in the comments.</strong></p>
<p>Try to build this network using only this number of neurons and
layers. In this article I have shown that it is possible to do it with
this many neurons only. If you introduce any more nodes then you will
certainly have some redundant neurons. <strong>Although, with more
neurons/layers, I have had better luck in training a good model more
consistently.</strong></p>
</blockquote>
<p>This problem may be related to the multiplication problem for neural networks: a flat (i.e. not deep / without hidden layers) neural network cannot perform even simple multiplication, cf. <strong><a href="https://stats.stackexchange.com/questions/217703/can-deep-neural-network-approximate-multiplication-function-without-normalizatio">https://stats.stackexchange.com/questions/217703/can-deep-neural-network-approximate-multiplication-function-without-normalizatio</a></strong>.</p>
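<p>The same depth argument applies directly here: XOR is the classic example of a function that a network without a hidden layer cannot represent, while a single nonlinear hidden layer suffices. Below is a minimal sketch in plain numpy (the 2-2-1 layer sizes, learning rate and epoch count are illustrative assumptions, not values taken from the article):</p>
<pre><code>import numpy as np

# XOR truth table: not linearly separable, so a model without a hidden
# layer cannot fit it, while a 2-2-1 network with a nonlinear hidden
# layer can.
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)
y = np.array([[0], [1], [1], [0]], dtype=float)

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

rng = np.random.default_rng(0)  # the seed matters, see the quote above
W1, b1 = rng.normal(size=(2, 2)), np.zeros((1, 2))  # input layer to hidden layer
W2, b2 = rng.normal(size=(2, 1)), np.zeros((1, 1))  # hidden layer to output
lr = 1.0                                            # assumed learning rate

for _ in range(5000):
    # forward pass
    h = sigmoid(X @ W1 + b1)
    p = sigmoid(h @ W2 + b2)
    # backward pass; with a sigmoid output and binary cross-entropy
    # loss, the output-layer error simplifies to (p - y)
    dz2 = (p - y) / len(X)
    dz1 = (dz2 @ W2.T) * h * (1 - h)
    W2 -= lr * (h.T @ dz2); b2 -= lr * dz2.sum(axis=0, keepdims=True)
    W1 -= lr * (X.T @ dz1); b1 -= lr * dz1.sum(axis=0, keepdims=True)

print(np.round(p, 3))  # approaches [[0], [1], [1], [0]] for most seeds
</code></pre>
<p>With an unlucky seed the same loop can still stall in a local minimum, which matches the sensitivity to seeding described in the quote above.</p>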
<p>Update (from the comments):</p>
<p>To be honest, I am not sure about the MSE error function, since it is a poor choice for classification problems, cf. <a href="https://towardsdatascience.com/why-using-mean-squared-error-mse-cost-function-for-binary-classification-is-a-bad-idea-933089e90df7" rel="nofollow noreferrer">https://towardsdatascience.com/why-using-mean-squared-error-mse-cost-function-for-binary-classification-is-a-bad-idea-933089e90df7</a> and <strong><a href="https://medium.com/autonomous-agents/how-to-teach-logic-to-your-neuralnetworks-116215c71a49" rel="nofollow noreferrer">https://medium.com/autonomous-agents/how-to-teach-logic-to-your-neuralnetworks-116215c71a49</a></strong> (which uses the <em>negative log likelihood</em> error function, also known as <em>multiclass cross-entropy</em>), and <a href="https://machinelearningmastery.com/how-to-choose-loss-functions-when-training-deep-learning-neural-networks/" rel="nofollow noreferrer">https://machinelearningmastery.com/how-to-choose-loss-functions-when-training-deep-learning-neural-networks/</a>:</p>
<blockquote>
<p>Mean Squared Error Loss</p>
<p><strong>The Mean Squared Error, or MSE, loss is the default loss to use for regression [not classification] problems.</strong></p>
</blockquote>
<p><em>Source: <a href="https://machinelearningmastery.com/how-to-choose-loss-functions-when-training-deep-learning-neural-networks/" rel="nofollow noreferrer">https://machinelearningmastery.com/how-to-choose-loss-functions-when-training-deep-learning-neural-networks/</a></em></p>
<p>Training on two labels or classes (<code>True</code>, <code>False</code>) is a classification problem, not a regression problem.</p>
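<p>The practical difference between the two loss functions is easy to see numerically: for a confident but wrong prediction, binary cross-entropy produces a much larger loss (and hence a much stronger gradient signal) than MSE. A small illustrative numpy sketch (the probabilities are made-up values):</p>
<pre><code>import numpy as np

def mse(y, p):
    return np.mean((y - p) ** 2)

def binary_cross_entropy(y, p, eps=1e-12):
    p = np.clip(p, eps, 1 - eps)  # avoid log(0)
    return -np.mean(y * np.log(p) + (1 - y) * np.log(1 - p))

y_true = np.array([1.0])            # the true class is "True"
for p in (0.9, 0.5, 0.01):          # confident-right, unsure, confident-wrong
    p_hat = np.array([p])
    print(f"p={p}:  MSE={mse(y_true, p_hat):.3f}  "
          f"BCE={binary_cross_entropy(y_true, p_hat):.3f}")

# p=0.9:  MSE=0.010  BCE=0.105
# p=0.5:  MSE=0.250  BCE=0.693
# p=0.01: MSE=0.980  BCE=4.605
</code></pre>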
<p>However, I think the main systematic problem is that the network is not deep enough. As stated in the article <a href="https://towardsdatascience.com/emulating-logical-gates-with-a-neural-network-75c229ec4cc9" rel="nofollow noreferrer">https://towardsdatascience.com/emulating-logical-gates-with-a-neural-network-75c229ec4cc9</a>, one can seed the initial combination of weights to avoid local minima, but that does not fix the fundamental problems either (the network is not deep enough, and the error function (MSE) is the wrong one).</p>
<p><a href="https://towardsdatascience.com/lets-code-a-neural-network-in-plain-numpy-ae7e74410795" rel="nofollow noreferrer">https://towardsdatascience.com/lets-code-a-neural-network-in-plain-numpy-ae7e74410795</a> contains a numpy implementation of a neural network for classification, including an implementation of a <em>binary cross-entropy</em> error function; you could compare that with your code.</p>