我有一个简单的pytorch模型。你知道吗
model = Network()
其细节如下:
Network(
(hidden): Linear(in_features=784, out_features=256, bias=True)
(output): Linear(in_features=256, out_features=10, bias=True)
(sigmoid): Sigmoid()
(softmax): Softmax(dim=1)
)
共有3层神经元。1个输入(786个神经元),1个隐藏(256个神经元)和1个输出(10个神经元)。因此将有两个重量层。所以两个权重层必须有两个偏差(简单地说就是两个浮点数),对吗?(如果我错了,请纠正我)。你知道吗
在初始化我的网络之后,我对这两个偏差值很好奇。所以我想检查隐藏层的偏移值,所以我写了:
model.hidden.bias
结果却出乎我的意料!我真的期望一个值!这就是我真正得到的:
tensor([-1.6868e-02, -3.5661e-02, 1.2489e-02, -2.7880e-02, 1.4025e-02,
-2.6085e-02, 1.2625e-02, -3.1748e-02, 5.0335e-03, 3.8031e-03,
-3.1648e-02, -3.4881e-02, -2.0026e-02, 1.9728e-02, 6.2461e-03,
9.3936e-04, -5.9270e-03, -2.7183e-02, -1.9850e-02, -3.5693e-02,
-1.9393e-02, 2.6555e-02, 2.3482e-02, 2.1230e-02, -2.2175e-02,
-2.4386e-02, 3.4848e-02, -2.6044e-02, 1.3575e-02, 9.4125e-03,
3.0012e-02, -2.6078e-02, 7.1615e-05, -1.7061e-02, 6.6355e-03,
-3.4966e-02, 2.9311e-02, 1.4060e-02, -2.5763e-02, -1.4020e-02,
2.9852e-02, -7.9176e-03, -1.8396e-02, 1.6927e-02, -1.1001e-03,
1.5595e-02, 1.2169e-02, -1.2275e-02, -2.9270e-03, -6.5685e-04,
-2.4297e-02, 3.0048e-02, 2.9692e-03, -2.5398e-02, 2.9955e-03,
-9.3653e-04, -1.2932e-02, 2.4232e-02, -3.5182e-02, -1.6163e-02,
3.0025e-02, 3.1227e-02, -8.2498e-04, 2.7102e-02, -2.3830e-02,
-3.4958e-02, -1.1886e-02, 1.6097e-02, 1.4579e-02, -2.6744e-02,
1.1900e-02, -3.4855e-02, -4.2208e-03, -5.2035e-03, 1.7055e-02,
-4.8580e-03, 3.4088e-03, 1.6923e-02, 3.5570e-04, -3.0478e-02,
8.4647e-03, 2.5704e-02, -2.3255e-02, 6.9396e-03, -1.2521e-03,
-9.4101e-03, -2.5798e-02, -1.4438e-03, -7.2684e-03, 3.5417e-02,
-3.4388e-02, 1.3706e-02, -5.1430e-03, 1.6174e-02, 1.8135e-03,
-2.9018e-02, -2.9083e-02, 7.4100e-03, -2.7758e-02, 2.4367e-02,
-3.8350e-03, 9.4390e-03, -1.0844e-02, 1.6381e-02, -2.5268e-02,
1.3553e-02, -1.0545e-02, -1.3782e-02, 2.8519e-02, 2.3630e-02,
-1.9703e-02, -2.0147e-02, -1.0485e-02, 2.4637e-02, 1.9989e-02,
5.6601e-03, 1.9121e-02, -1.5286e-02, 2.5996e-02, -2.9833e-02,
-2.9458e-02, 2.3944e-02, -3.0107e-02, -1.2307e-02, -1.8419e-02,
3.3551e-02, 1.2396e-02, 2.9356e-02, 3.3274e-02, 5.4677e-03,
3.1715e-02, 1.3361e-02, 3.3042e-02, 2.7843e-03, 2.2837e-02,
-3.4981e-02, 3.2355e-02, -2.7658e-03, 2.2184e-02, -2.0203e-02,
-3.3264e-02, -3.4858e-02, 1.0820e-03, -1.4279e-02, -2.8041e-02,
4.1962e-03, 2.4266e-02, -3.5704e-02, -2.6172e-02, 2.3335e-02,
2.0657e-02, -3.0387e-03, -5.7096e-03, -1.1062e-02, 1.3450e-02,
-3.3965e-02, 1.9623e-03, -2.0067e-02, -3.3858e-02, -2.1931e-02,
-1.5414e-02, 2.4454e-02, 2.5668e-02, -1.1932e-02, 5.7540e-04,
1.5130e-02, 1.3916e-02, -2.1521e-02, -3.0575e-02, 1.8841e-02,
-2.3240e-02, -2.7297e-02, -3.2668e-02, -1.5544e-02, -5.9408e-03,
3.0241e-02, 2.2039e-02, -2.4389e-02, 3.1703e-02, 3.5305e-02,
-2.7501e-03, 2.0154e-02, -5.3489e-03, 1.4177e-02, 1.6829e-02,
3.3066e-02, -1.3425e-02, -3.2565e-02, 6.5624e-03, -1.5681e-02,
2.3047e-02, 6.5880e-03, -3.3803e-02, 2.3790e-02, -5.5061e-03,
2.9413e-02, 1.2290e-02, -1.0958e-02, 1.2680e-03, 1.3343e-02,
6.6689e-03, -2.2975e-03, -1.2068e-02, 1.6523e-02, -3.1612e-02,
-1.7529e-02, -2.2220e-02, -1.4723e-02, -1.3495e-02, -5.1805e-03,
-2.9620e-02, 3.0571e-02, -3.0999e-02, 3.3681e-03, 1.3579e-02,
1.4837e-02, 1.5694e-02, -1.1178e-02, 4.6233e-03, -2.2583e-02,
-3.5281e-03, 3.0918e-02, 2.6407e-02, 1.5822e-04, -3.0181e-03,
8.6989e-03, 2.8998e-02, -1.5975e-02, -3.1574e-02, -1.5609e-02,
1.0472e-02, 5.8976e-03, 7.0131e-03, -3.2047e-02, 2.6045e-02,
-2.8882e-02, -2.2121e-02, -3.2960e-02, 1.8268e-02, 3.0984e-02,
1.4824e-02, 3.0010e-02, -5.7523e-03, -2.0017e-02, 4.8700e-03,
1.4997e-02, -1.4898e-02, 6.8572e-03, 9.7713e-03, 1.3410e-02,
4.9619e-03, 3.1016e-02, 3.1240e-02, -3.0203e-02, 2.1435e-02,
2.7331e-02], requires_grad=True)
有人能给我解释一下这种行为吗?为什么我得到256个值而不是一个?你知道吗
编辑1:
以下是我对层次的理解:
对于整个神经元层来说,偏差只是一个值。我说得对吗?但是我看到的输出是256个值?为什么?皮托克认为我对每个神经元都有偏见吗?这样可以吗?
看看这个:
这会给你一个巨大的列表,像这样:
您将知道所有参数并得到基础,但是如果打印模型,您将得到:
bias设置为
True
或False
的信息,表示是否实际使用它们。您也可以通过修改第一段代码来检查最后一段代码,但希望这会有所帮助。你知道吗所以首先,重要的是要了解其中一层的内部情况。当你写:
您正在为输入和输出之间的线性关系建模。你可能对基础数学很熟悉:
但是,不是“斜率”和“y截距”,而是权重矩阵和偏差项。这仍然是一个线性关系,但矩阵作为我们的输入和输出。你知道吗
Y是我们的输出,M是我们的权重矩阵,X是我们的输入,B是我们的偏差。您定义输入是(nx784)矩阵,输出是(nx256)矩阵(N是样本数)。你知道吗
如果你熟悉矩阵乘法,这意味着我们的权重矩阵是(784 X 256)。MX的输出将是一个(nx256)矩阵,因此我们的偏差项也必须是(nx256),MX+B才能计算出来。你知道吗
一般来说,偏差项中的值的数量将与
out_features
的数量相同。你知道吗相关问题 更多 >
编程相关推荐