Simulated annealing on a multilayer neural network in Java: XOR experiment


I am a beginner with this concept, and I am trying to train a feed-forward neural network (2x2x1 topology):

    Bias and weight range of each neuron    Outputs for XOR test inputs
    [-1,1]                                  1,1 ---->  0.9
                                            1,0 ---->  0.8
                                            0,1 ----> -0.1
                                            0,0 ---->  0.1

    [-10,10]                                1,1 ---->  0.24
                                            1,0 ---->  0.67
                                            0,1 ----> -0.54
                                            0,0 ---->  0.10

    [-4,4]                                  1,1 ----> -0.02
                                            1,0 ---->  0.80
                                            0,1 ---->  0.87
                                            0,0 ----> -0.09

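So the [-4,4] range seems to work better than the others.

Question: is there a way to find suitable limits for the weights and biases in relation to the temperature limits and the rate of temperature decrease?

Note: I am trying two approaches here. The first randomizes all weights and biases at once on every trial. The second randomizes only a single weight and a single bias per trial (50 iterations before the temperature is lowered). Changing a single weight at a time gives worse results.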

Settings ((n+1) is the next value, (n) is the previous value):

- TempMax = 2.0, TempMin = 0.1 (as TempMin approaches zero, the error of the XOR output approaches zero too)
- Cooling: Temp(n+1) = Temp(n) / 1.001
- Weight update: w(n+1) = w(n) + (float)(Math.random()*t*2.0f - t*1.0f); // t is the temperature (same for the bias update)
- Iterations per temperature: 50
- Random numbers: Java's Math.random() method (are its spectral properties appropriate for annealing?)
- Transition probability: 1.0f/(1.0f + Math.exp(((candidate state error) - (old error))/temp))
- Neuron activation function: Math.tanh()
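The settings above can be put together roughly as follows. This is a minimal sketch, not the asker's actual code: the class and method names, the parameter layout, the clamp to [-limit, limit], the sum-of-squares error and the best-so-far bookkeeping are all assumptions made for illustration, and a seeded java.util.Random stands in for Math.random() so that runs are repeatable. It implements the first approach from the note (every weight and bias is perturbed on each trial).

```java
import java.util.Random;

/** Sketch: simulated annealing of a 2x2x1 tanh network on XOR (names are illustrative). */
final class XorAnnealing {
    static final double[][] INPUTS = {{0, 0}, {0, 1}, {1, 0}, {1, 1}};
    static final double[] TARGETS = {0, 1, 1, 0};

    // Assumed layout: p[0..2] hidden neuron 0 (weight, weight, bias),
    // p[3..5] hidden neuron 1, p[6..8] output neuron.
    static double forward(double[] p, double[] x) {
        double h0 = Math.tanh(p[0] * x[0] + p[1] * x[1] + p[2]);
        double h1 = Math.tanh(p[3] * x[0] + p[4] * x[1] + p[5]);
        return Math.tanh(p[6] * h0 + p[7] * h1 + p[8]);
    }

    /** Sum of squared errors over the four XOR patterns (an assumed error measure). */
    static double error(double[] p) {
        double sum = 0.0;
        for (int i = 0; i < INPUTS.length; i++) {
            double d = forward(p, INPUTS[i]) - TARGETS[i];
            sum += d * d;
        }
        return sum;
    }

    /** Transition probability from the question: 1 / (1 + exp((candidate - old) / temp)). */
    static double acceptProbability(double candidateError, double oldError, double temp) {
        return 1.0 / (1.0 + Math.exp((candidateError - oldError) / temp));
    }

    /** Adds uniform noise in [-t, t] to every parameter, then clamps to [-limit, limit]. */
    static double[] perturbAll(double[] p, double t, double limit, Random rng) {
        double[] q = p.clone();
        for (int i = 0; i < q.length; i++) {
            q[i] += rng.nextDouble() * 2.0 * t - t;
            q[i] = Math.max(-limit, Math.min(limit, q[i]));
        }
        return q;
    }

    /** Runs the schedule from the question and returns the best parameters seen. */
    static double[] anneal(double limit, long seed) {
        Random rng = new Random(seed);
        double[] current = new double[9];
        for (int i = 0; i < current.length; i++) {
            current[i] = (rng.nextDouble() * 2.0 - 1.0) * limit; // uniform in [-limit, limit]
        }
        double currentError = error(current);
        double[] best = current.clone();
        double bestError = currentError;

        for (double temp = 2.0; temp > 0.1; temp /= 1.001) { // TempMax, TempMin, cooling rate
            for (int it = 0; it < 50; it++) {                 // iterations per temperature
                double[] candidate = perturbAll(current, temp, limit, rng);
                double candidateError = error(candidate);
                if (rng.nextDouble() < acceptProbability(candidateError, currentError, temp)) {
                    current = candidate;
                    currentError = candidateError;
                    if (currentError < bestError) {           // bookkeeping added in this sketch
                        best = current.clone();
                        bestError = currentError;
                    }
                }
            }
        }
        return best;
    }
}
```

A call such as XorAnnealing.anneal(4.0, 42L) runs one annealing pass with the [-4,4] range; passing the result to forward() for each of the four inputs gives outputs comparable to the table above. Note that with this transition probability a candidate with the same error is accepted half of the time, and a worse candidate is never accepted with probability above 0.5.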

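After many attempts the results are almost the same. Is re-annealing the only way to escape deeper local minima?

I need a suitable weight/bias range or limit as a function of the total number of neurons, the number of layers and the starting/ending temperature. A 3x6x5x6x1 network can handle a 3-bit input and produce the output, and it can approximate a step function, but I always have to play with the ranges.

For this training data set the output error is too large (193 data points, 2 inputs, 1 output):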

193 2 1
0.499995 0.653846 1.
0.544418 0.481604 1.
0.620200 0.320118 1.
0.595191 0.404816 0
0.404809 0.595184 1.
0.171310 0.636142 0
0.014323 0.403392 0
0.617884 0.476556 0
0.391548 0.478424 1.
0.455912 0.721618 0
0.615385 0.500005 0
0.268835 0.268827 0
0.812761 0.187243 0
0.076923 0.499997 1.
0.769231 0.500006 0
0.650862 0.864223 0
0.799812 0.299678 1.
0.328106 0.614848 0
0.591985 0.722088 0
0.692308 0.500005 1.
0.899757 0.334418 0
0.484058 0.419839 1.
0.200188 0.700322 0
0.863769 0.256940 0
0.384615 0.499995 1.
0.457562 0.508439 0
0.515942 0.580161 0
0.844219 0.431535 1.
0.456027 0.529379 0
0.235571 0.104252 0
0.260149 0.400644 1.
0.500003 0.423077 1.
0.544088 0.278382 1.
0.597716 0.540480 0
0.562549 0.651021 1.
0.574101 0.127491 1.
0.545953 0.731052 0
0.649585 0.350424 1.
0.607934 0.427886 0
0.499995 0.807692 1.
0.437451 0.348979 0
0.382116 0.523444 1.
1 0.500000 1.
0.731165 0.731173 1.
0.500002 0.038462 0
0.683896 0.536585 1.
0.910232 0.581604 0
0.499998 0.961538 1.
0.903742 0.769772 1.
0.543973 0.470621 1.
0.593481 0.639914 1.
0.240659 0.448408 1.
0.425899 0.872509 0
0 0.500000 0
0.500006 0.269231 1.
0.155781 0.568465 0
0.096258 0.230228 0
0.583945 0.556095 0
0.550746 0.575954 0
0.680302 0.935290 1.
0.693329 0.461550 1.
0.500005 0.192308 0
0.230769 0.499994 1.
0.721691 0.831791 0
0.621423 0.793156 1.
0.735853 0.342415 0
0.402284 0.459520 1.
0.589105 0.052045 0
0.189081 0.371208 0
0.533114 0.579952 0
0.251594 0.871762 1.
0.764429 0.895748 1.
0.499994 0.730769 0
0.415362 0.704317 0
0.422537 0.615923 1.
0.337064 0.743842 1.
0.560960 0.806496 1.
0.810919 0.628792 1.
0.319698 0.064710 0
0.757622 0.393295 0
0.577463 0.384077 0
0.349138 0.135777 1.
0.165214 0.433402 0
0.241631 0.758362 0
0.118012 0.341772 1.
0.514072 0.429271 1.
0.676772 0.676781 0
0.294328 0.807801 0
0.153846 0.499995 0
0.500005 0.346154 0
0.307692 0.499995 0
0.615487 0.452168 0
0.466886 0.420048 1.
0.440905 0.797064 1.
0.485928 0.570729 0
0.470919 0.646174 1.
0.224179 0.315696 0
0.439040 0.193504 0
0.408015 0.277912 1.
0.316104 0.463415 0
0.278309 0.168209 1.
0.214440 0.214435 1.
0.089768 0.418396 1.
0.678953 0.767832 1.
0.080336 0.583473 1.
0.363783 0.296127 1.
0.474240 0.562183 0
0.313445 0.577267 0
0.416055 0.443905 1.
0.529081 0.353826 0
0.953056 0.687662 1.
0.534725 0.448035 1.
0.469053 0.344394 0
0.759341 0.551592 0
0.705672 0.192199 1.
0.385925 0.775385 1.
0.590978 0.957385 1.
0.406519 0.360086 0
0.409022 0.042615 0
0.264147 0.657585 1.
0.758369 0.241638 1.
0.622380 0.622388 1.
0.321047 0.232168 0
0.739851 0.599356 0
0.555199 0.366750 0
0.608452 0.521576 0
0.352098 0.401168 0
0.530947 0.655606 1.
0.160045 0.160044 0
0.455582 0.518396 0
0.881988 0.658228 0
0.643511 0.153547 1.
0.499997 0.576923 0
0.575968 0.881942 0
0.923077 0.500003 0
0.449254 0.424046 1.
0.839782 0.727039 0
0.647902 0.598832 1.
0.444801 0.633250 1.
0.392066 0.572114 1.
0.242378 0.606705 1.
0.136231 0.743060 1.
0.711862 0.641568 0
0.834786 0.566598 1.
0.846154 0.500005 1.
0.538462 0.500002 1.
0.379800 0.679882 0
0.584638 0.295683 1.
0.459204 0.540793 0
0.331216 0.430082 0
0.672945 0.082478 0
0.671894 0.385152 1.
0.046944 0.312338 0
0.499995 0.884615 0
0.542438 0.491561 1.
0.540796 0.459207 1.
0.828690 0.363858 1.
0.785560 0.785565 0
0.686555 0.422733 1.
0.231226 0.553456 1.
0.465275 0.551965 0
0.378577 0.206844 0
0.567988 0.567994 0
0.668784 0.569918 1.
0.384513 0.547832 1.
0.288138 0.358432 1.
0.432012 0.432006 1.
0.424032 0.118058 1.
0.296023 0.703969 1.
0.525760 0.437817 1.
0.748406 0.128238 0
0.775821 0.684304 1.
0.919664 0.416527 0
0.327055 0.917522 1.
0.985677 0.596608 1.
0.356489 0.846453 0
0.500005 0.115385 1.
0.377620 0.377612 0
0.559095 0.202936 0
0.410895 0.947955 1.
0.187239 0.812757 1.
0.768774 0.446544 0
0.614075 0.224615 0
0.350415 0.649576 0
0.160218 0.272961 1.
0.454047 0.268948 1.
0.306671 0.538450 0
0.323228 0.323219 1.
0.839955 0.839956 1.
0.636217 0.703873 0
0.703977 0.296031 0
0.662936 0.256158 0
0.100243 0.665582 1.

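# Answer 1

I highly doubt that there is any strict rule for your problem. First of all, the limits/bounds of the weights depend strictly on the input data representation, the activation functions, the number of neurons and the output function. At best, what you can rely on here are rules of thumb.

First, consider the initial weight values used in classical algorithms. A basic idea for the weight scale is to use the range [-1,1] for small layers and, for large layers, to divide it by the square root of the number of units in the larger layer. More sophisticated methods are described by Bishop (1995). From such a rule of thumb we can deduce that a reasonable range (simply one order of magnitude larger than the initial guess) would be something of the form [-10,10]/sqrt(neurons_count_in_the_lower_layer).

Unfortunately, as far as I know, the choice of temperature is much more complex, because it is a data-dependent factor rather than a purely topology-based one. Some papers suggest values for specific time series prediction tasks, but nothing general. For simulated annealing "in general" (not just for neural network training), many heuristic choices have been proposed, for example: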
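As a rough illustration of that rule of thumb, the per-layer bound could be computed like this. The class name, the method names and the base parameter are made up for this sketch, and "lower layer" is taken here to mean the layer that feeds into the weights in question.

```java
/** Sketch of the rule of thumb: limit = base / sqrt(size of the lower layer). */
final class WeightRange {
    /** Symmetric bound L, so that weights fed by lowerLayerSize units lie in [-L, L]. */
    static double limit(double base, int lowerLayerSize) {
        if (lowerLayerSize <= 0) {
            throw new IllegalArgumentException("lowerLayerSize must be positive");
        }
        return base / Math.sqrt(lowerLayerSize);
    }

    /** One bound per weight layer of a feed-forward topology such as {2, 2, 1}. */
    static double[] limitsFor(double base, int[] layerSizes) {
        if (layerSizes.length < 2) {
            throw new IllegalArgumentException("need at least an input and an output layer");
        }
        double[] limits = new double[layerSizes.length - 1];
        for (int i = 1; i < layerSizes.length; i++) {
            limits[i - 1] = limit(base, layerSizes[i - 1]); // fan-in comes from layer i - 1
        }
        return limits;
    }
}
```

With base = 10, the 2x2x1 network gets a bound of about 7.07 for both weight layers, and the 3x6x5x6x1 network gets roughly 5.77, 4.08, 4.47 and 4.08. These are starting points for experiments, not guaranteed ranges.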

If we know the maximum distance (cost function difference) between one neighbour and another then we can use this information to calculate a starting temperature. Another method, suggested in (13. Rayward-Smith, V.J., Osman, I.H., Reeves, C.R., Smith, G.D. 1996. Modern Heuristic Search Methods. John Wiley & Sons.), is to start with a very high temperature and cool it rapidly until about 60% of worst solutions are being accepted. This forms the real starting temperature and it can now be cooled more slowly. A similar idea, suggested in (5. Dowsland, K.A. 1995. Simulated Annealing. In Modern Heuristic Techniques for Combinatorial Problems (ed. Reeves, C.R.), McGraw-Hill, 1995), is to rapidly heat the system until a certain proportion of worse solutions are accepted and then slow cooling can start. This can be seen to be similar to how physical annealing works in that the material is heated until it is liquid and then cooling begins (i.e. once the material is a liquid it is pointless carrying on heating it). [from notes from University of Nottingham]
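The "heat until a given share of worse moves is accepted" idea can be sketched as below. Several things here are assumptions rather than part of the quoted notes: the names, the use of the expected acceptance ratio over a pre-collected sample of error increases (instead of actual random accept/reject trials), and the Metropolis rule exp(-delta/T). With the logistic transition probability from the question, a worse move is never accepted with probability above 0.5, so a 60% target could not be reached and would have to be lowered.

```java
/** Sketch: raise the temperature until a target share of worse moves would be accepted. */
final class StartTemperature {
    /** Mean Metropolis acceptance probability exp(-delta / temp) over sampled error increases. */
    static double acceptedFraction(double[] worseDeltas, double temp) {
        double sum = 0.0;
        for (double delta : worseDeltas) {
            sum += Math.exp(-delta / temp);
        }
        return sum / worseDeltas.length;
    }

    /**
     * Starts at startTemp and multiplies by factor until the expected share of
     * accepted worse moves reaches target; the result is the starting temperature.
     */
    static double heatUntil(double[] worseDeltas, double target, double startTemp, double factor) {
        if (worseDeltas.length == 0 || target <= 0.0 || target >= 1.0
                || startTemp <= 0.0 || factor <= 1.0) {
            throw new IllegalArgumentException(
                    "need samples, 0 < target < 1, startTemp > 0 and factor > 1");
        }
        double temp = startTemp;
        while (acceptedFraction(worseDeltas, temp) < target) {
            temp *= factor; // heat rapidly
        }
        return temp;
    }
}
```

The worseDeltas sample would come from a short random walk over the network's weights, recording the error increase of every worsening move; slow cooling then starts from the returned temperature.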

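However, choosing the one that works best for your application has to be based on extensive testing, as with most things in machine learning. If what you really care about is a well-trained neural network, it seems reasonable to look into Extreme Learning Machines (ELM), where training is cast as a global optimization procedure that guarantees the best possible solution (with respect to the regularized cost function used). Simulated annealing, as an iterative, greedy procedure (like backpropagation), cannot guarantee anything; there are only heuristics and rules of thumb.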
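To make the contrast concrete, here is a minimal sketch of the ELM idea as it is usually described: the hidden weights are drawn at random and left fixed, and only the linear output weights are fitted, by ridge-regularized least squares, which has a closed-form global minimum. The class and method names, the tanh hidden layer, the [-1, 1] sampling range and the small Gaussian-elimination solver are assumptions made for this sketch. The optimality holds only for the output weights given that particular random hidden layer.

```java
import java.util.Random;

/** Sketch of an extreme learning machine: random tanh hidden layer, linear output. */
final class ElmSketch {
    final double[][] hiddenWeights; // [hidden][inputs + 1], last column is the bias
    double[] outputWeights;         // [hidden]

    ElmSketch(int inputs, int hidden, long seed) {
        Random rng = new Random(seed);
        hiddenWeights = new double[hidden][inputs + 1];
        for (double[] row : hiddenWeights) {
            for (int j = 0; j < row.length; j++) {
                row[j] = rng.nextDouble() * 2.0 - 1.0; // fixed random weight in [-1, 1)
            }
        }
        outputWeights = new double[hidden];
    }

    double[] hiddenActivations(double[] x) {
        double[] h = new double[hiddenWeights.length];
        for (int k = 0; k < h.length; k++) {
            double s = hiddenWeights[k][x.length]; // bias term
            for (int j = 0; j < x.length; j++) {
                s += hiddenWeights[k][j] * x[j];
            }
            h[k] = Math.tanh(s);
        }
        return h;
    }

    double predict(double[] x) {
        double[] h = hiddenActivations(x);
        double y = 0.0;
        for (int k = 0; k < h.length; k++) {
            y += outputWeights[k] * h[k];
        }
        return y;
    }

    /** Solves (H^T H + lambda I) beta = H^T y for the output weights; lambda must be > 0. */
    void fit(double[][] xs, double[] ys, double lambda) {
        int m = hiddenWeights.length;
        double[][] a = new double[m][m + 1]; // augmented matrix [H^T H + lambda I | H^T y]
        for (int i = 0; i < xs.length; i++) {
            double[] h = hiddenActivations(xs[i]);
            for (int r = 0; r < m; r++) {
                for (int c = 0; c < m; c++) {
                    a[r][c] += h[r] * h[c];
                }
                a[r][m] += h[r] * ys[i];
            }
        }
        for (int r = 0; r < m; r++) {
            a[r][r] += lambda;
        }
        outputWeights = solve(a);
    }

    /** Gaussian elimination with partial pivoting on an augmented m x (m + 1) matrix. */
    static double[] solve(double[][] a) {
        int m = a.length;
        for (int col = 0; col < m; col++) {
            int pivot = col;
            for (int r = col + 1; r < m; r++) {
                if (Math.abs(a[r][col]) > Math.abs(a[pivot][col])) {
                    pivot = r;
                }
            }
            double[] tmp = a[col];
            a[col] = a[pivot];
            a[pivot] = tmp;
            for (int r = col + 1; r < m; r++) {
                double f = a[r][col] / a[col][col];
                for (int c = col; c <= m; c++) {
                    a[r][c] -= f * a[col][c];
                }
            }
        }
        double[] x = new double[m];
        for (int r = m - 1; r >= 0; r--) {
            double s = a[r][m];
            for (int c = r + 1; c < m; c++) {
                s -= a[r][c] * x[c];
            }
            x[r] = s / a[r][r];
        }
        return x;
    }
}
```

For XOR one might construct new ElmSketch(2, 8, 7L), call fit with the four patterns and a small lambda such as 1e-3, and then compare predict on each input with the targets. There is no temperature or weight range to tune for the output layer, which is the point of the comparison; the hidden-layer sampling range remains a free choice.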

