<p>I am trying to reproduce some of the examples from <a href="http://neuralnetworksanddeeplearning.com/index.html" rel="nofollow noreferrer">Neural Networks and Deep Learning</a> in Keras, but I am running into problems training a network based on the architecture from chapter 1. The goal is to classify handwritten digits from the MNIST dataset.
The architecture:</p>
<ul>
<li>784 inputs (one for each of the 28*28 pixels in an MNIST image)</li>
<li>A hidden layer of 30 neurons</li>
<li>An output layer of 10 neurons</li>
<li>Weights and biases initialized from a Gaussian distribution with mean 0 and standard deviation 1</li>
<li>The loss/cost function is mean squared error (the book's quadratic cost; see the sketch after this list)</li>
<li>The optimizer is stochastic gradient descent</li>
</ul>
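<p>For reference, the cost function I am trying to match is the quadratic cost from chapter 1 of the book, C(w,b) = 1/(2n) · Σ_x ||y(x) − a||². A rough NumPy version (my own sketch, not code from the book) would be:</p>
<pre class="lang-python prettyprint-override"><code>import numpy as np

# Quadratic cost from chapter 1 of the book:
# C(w, b) = 1/(2n) * sum over training samples x of ||y(x) - a||^2
# y_true and y_pred are (n_samples, 10) arrays of one-hot labels and network
# outputs respectively (the names are my own, just for illustration).
def quadratic_cost(y_true, y_pred):
    n = y_true.shape[0]
    return np.sum((y_true - y_pred) ** 2) / (2 * n)
</code></pre>
<p>I am assuming Keras's built-in 'mean_squared_error' is close enough for reproducing the example, even though it averages the squared errors instead of using the halved sum above, which should only rescale the gradients.</p>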
<p>Hyperparameters:</p>
<ul>
<li>Learning rate = 3.0</li>
<li>Batch size = 10</li>
<li>Epochs = 30</li>
</ul>
<p>My code:</p>
<pre class="lang-python prettyprint-override"><code>import keras
from keras.datasets import mnist
from keras.models import Sequential
from keras.layers import Dense
from keras.optimizers import SGD
from keras.initializers import RandomNormal
# import data
(x_train, y_train), (x_test, y_test) = mnist.load_data()
# input image dimensions
img_rows, img_cols = 28, 28
x_train = x_train.reshape(x_train.shape[0], img_rows * img_cols)
x_test = x_test.reshape(x_test.shape[0], img_rows * img_cols)
input_shape = (img_rows * img_cols,)
x_train = x_train.astype('float32')
x_test = x_test.astype('float32')
x_train /= 255
x_test /= 255
print('x_train shape:', x_train.shape)
print(x_train.shape[0], 'train samples')
print(x_test.shape[0], 'test samples')
# convert class vectors to binary class matrices
num_classes = 10
y_train = keras.utils.to_categorical(y_train, num_classes)
y_test = keras.utils.to_categorical(y_test, num_classes)
print('y_train shape:', y_train.shape)
# Construct model
# 784 * 30 * 10
# Normal distribution for weights/biases
# Stochastic Gradient Descent optimizer
# Mean squared error loss (cost function)
model = Sequential()
layer1 = Dense(30,
               input_shape=input_shape,
               kernel_initializer=RandomNormal(stddev=1),
               bias_initializer=RandomNormal(stddev=1))
model.add(layer1)
layer2 = Dense(10,
               kernel_initializer=RandomNormal(stddev=1),
               bias_initializer=RandomNormal(stddev=1))
model.add(layer2)
print('Layer 1 input shape: ', layer1.input_shape)
print('Layer 1 output shape: ', layer1.output_shape)
print('Layer 2 input shape: ', layer2.input_shape)
print('Layer 2 output shape: ', layer2.output_shape)
model.summary()
model.compile(optimizer=SGD(lr=3.0),
              loss='mean_squared_error',
              metrics=['accuracy'])
# Train
model.fit(x_train,
          y_train,
          batch_size=10,
          epochs=30,
          verbose=2)
# Run on test data and output results
result = model.evaluate(x_test,
                        y_test,
                        verbose=1)
print('Test loss: ', result[0])
print('Test accuracy: ', result[1])
</code></pre>
<p>Output (with Python 3.6 and the TensorFlow backend):</p>
<p>(similar output is repeated for all 30 epochs; the final epoch is shown below)</p>
<pre class="lang-python prettyprint-override"><code>Epoch 30/30
- 6s - loss: nan - acc: 0.0987
10000/10000 [==============================] - 0s 22us/step
Test loss: nan
Test accuracy: 0.098
</code></pre>
<p>As you can see, the network is not learning at all, and I can't figure out why. The loss goes to nan and the accuracy stays at about 10%, which is what random guessing would give over 10 classes. As far as I can tell the shapes all look fine. What am I doing that prevents the network from learning?</p>
<p>(As an aside, I know that a cross-entropy loss and a softmax output layer would work better; however, judging from the linked book, they shouldn't be strictly necessary. The network implemented by hand in chapter 1 of the book learns successfully, and I'm trying to reproduce that before moving on.)</p>
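<p>For reference, the chapter 1 network in the book uses sigmoid neurons in every layer, whereas I did not specify any activation above (Keras's Dense layer defaults to a linear activation). A minimal sketch of the same 784-30-10 model with the sigmoid activations made explicit (same initializers, optimizer, and loss as above; I have not confirmed whether this is related to my problem) would be:</p>
<pre class="lang-python prettyprint-override"><code>from keras.models import Sequential
from keras.layers import Dense
from keras.optimizers import SGD
from keras.initializers import RandomNormal

# Same 784-30-10 architecture, but with the book's sigmoid neurons made explicit.
model = Sequential()
model.add(Dense(30,
                input_shape=(784,),
                activation='sigmoid',
                kernel_initializer=RandomNormal(stddev=1),
                bias_initializer=RandomNormal(stddev=1)))
model.add(Dense(10,
                activation='sigmoid',
                kernel_initializer=RandomNormal(stddev=1),
                bias_initializer=RandomNormal(stddev=1)))
model.compile(optimizer=SGD(lr=3.0),
              loss='mean_squared_error',
              metrics=['accuracy'])
</code></pre>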