NumPy standard deviation: division by zero

import tensorflow as tf
import numpy as np
from sklearn.datasets import load_boston
from pprint import pprint


def regularise(features):
    # Regularised features:
    reg_features = np.zeros(features.shape)
    for x in range(len(features)):
        for y in range(len(features[x])):
            reg_features[x][y] = (features[x][y] - np.mean(features[:, y])) / np.std(features[:, y])
    return reg_features


# Get the data
total_features, total_prices = load_boston(return_X_y=True)

# Keep 300 samples for training
train_features = regularise(total_features[:300])  # Works OK
train_prices = total_prices[:300]

# Keep 100 samples for validation
valid_features = regularise(total_features[300:400])  # Works OK
valid_prices = total_prices[300:400]

# Keep remaining samples as test set
test_features = regularise(total_features[400:])  # Does not work
test_prices = total_prices[400:]

2 Answers

User

Answer 1 · edited 2024-10-01 07:50:05

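Usually, when this happens, the first guess is that you are dividing the numerator by a larger integer (rather than a float), so the result is 0. That is not the case here, however.

Sometimes the division is not performed element-wise as you expect, but as a vector operation. That is not the case here either.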

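The problem here is how the DataFrame is being referenced:

reg_features[x][y]

When working with a DataFrame and assigning a value to a specific cell, use loc.

You can read more about it here: http://pandas.pydata.org/pandas-docs/stable/generated/pandas.DataFrame.loc.html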
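
A minimal sketch of cell assignment with loc, assuming the data is held in a pandas DataFrame (in the question, reg_features is created with np.zeros, so this only applies after converting it). The frame and its column names "a" and "b" are made up for illustration.

```python
import pandas as pd

# Hypothetical toy frame; the column names are invented for this example.
df = pd.DataFrame({"a": [1.0, 2.0, 3.0], "b": [4.0, 5.0, 6.0]})

# Chained indexing such as df["a"][0] = ... may write to a temporary copy.
# .loc addresses the cell by row label and column label in a single step.
df.loc[0, "a"] = 10.0

# .iloc is the positional counterpart (row 1, column 1 here).
df.iloc[1, 1] = 50.0

print(df)
```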

User

Answer 2 · edited 2024-10-01 07:50:05

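It is the subset total_features[400:] that causes the problem. If you look at that data, you will see that the columns total_features[400:, 1] and {} are all 0. This breaks the code because the mean and the standard deviation of those columns are both 0, so the result is 0/0, which NumPy evaluates to nan.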
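
The effect can be reproduced on a small array. This is only a sketch: the 3x2 array below is a made-up stand-in for the slice of the Boston data, with its second column constant at 0.

```python
import numpy as np

# Hypothetical 3x2 array: the second column is constant (all zeros).
features = np.array([[1.0, 0.0],
                     [2.0, 0.0],
                     [3.0, 0.0]])

mean = features.mean(axis=0)  # per-column mean
std = features.std(axis=0)    # per-column standard deviation

# Silence the RuntimeWarning NumPy emits for 0/0 so the output stays readable.
with np.errstate(invalid="ignore", divide="ignore"):
    z = (features - mean) / std

print(std[1])                   # standard deviation of the constant column
print(np.isnan(z[:, 1]).all())  # whether the constant column became nan
```

The first column is standardised normally; only the constant column ends up as nan, which matches the behaviour seen on the test slice.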

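Instead of writing your own regularisation function, you can use sklearn.preprocessing.scale. That function handles constant columns by returning a column of all 0s.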

You can easily verify that scale performs the same calculation as your regularise:

In [68]: test
Out[68]: 
array([[ 15.,   1.,   0.],
       [  3.,   4.,   5.],
       [  6.,   7.,   8.],
       [  9.,  10.,  11.],
       [ 12.,  13.,   1.]])

In [69]: regularise(test)
Out[69]: 
array([[ 1.41421356, -1.41421356, -1.20560706],
       [-1.41421356, -0.70710678,  0.        ],
       [-0.70710678,  0.        ,  0.72336423],
       [ 0.        ,  0.70710678,  1.44672847],
       [ 0.70710678,  1.41421356, -0.96448564]])

In [70]: from sklearn.preprocessing import scale

In [71]: scale(test)
Out[71]: 
array([[ 1.41421356, -1.41421356, -1.20560706],
       [-1.41421356, -0.70710678,  0.        ],
       [-0.70710678,  0.        ,  0.72336423],
       [ 0.        ,  0.70710678,  1.44672847],
       [ 0.70710678,  1.41421356, -0.96448564]])

scale handles a column made up entirely of zeros the same way: it returns zeros for that column rather than nan.
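
As a rough illustration of that behaviour (not the answer's original session), the sketch below runs scale on a made-up array whose second column is all zeros.

```python
import numpy as np
from sklearn.preprocessing import scale

# Hypothetical input: the first column varies, the second is all zeros.
x = np.array([[1.0, 0.0],
              [2.0, 0.0],
              [3.0, 0.0]])

# scale standardises each column (axis=0 is the default).
scaled = scale(x)

print(scaled[:, 1])            # the zero column after scaling
print(np.isnan(scaled).any())  # whether any nan was produced
```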
