Standardizing a dataset that contains very large values

Published 2024-06-13 22:23:52


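I am standardizing my features to mean = 0 and sd = 1 using preprocessing.scale after casting the data with astype('float64'). I got the following warning: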

UserWarning: Numerical issues were encountered when centering the data and might not be solved. dataset may contain too large values. You may need to prescale your features.
  warnings.warn("Numerical issues were encountered "

Here is a sample of the dataset:

   col1  col2    col3    col4    col5         col6         col7         col8         col9         col10        col11        col12        col13
0  327   143.04  123.66  101.71  89.36575914  0.668110013  84.13713837  588.103818   633.6584113  525.5505746  132.966095   13.05099964  131.7220566
1  1010  188.98  176.78  137.33  89.36575914  0.620949984  40.52060699  1413.802012  3705.255352  1641.459378  106.3353716  7.69299984   472.4249759
2  1485  166.67  141.72  111.07  98.91169739  0.979290009  100          3580.441388  4327.644518  3242.16829   111.2140427  13.05300045  1164.119187
3  78    54.27   83.01   161.74  95.0061264   0.968744297  100          35644.07894  37765.71684  15667.95157  106.3043671  7.448999882  850.651571
4  591   132.86  121.22  108.13  103.231369   1.039739966  100          9348.743837  10699.19772  7144.242782  101.7313309  8.788999557  1382.113557
5  562   134.98  141.72  141.15  89.36575914  0.968744297  100          3046.147835  3710.575743  2716.801411  106.3353716  18.26099968  1076.131188
6  1030  110.83  79.08   50.87   89.36575914  0.952409983  97.35466766  11348.70932  11928.21847  7637.253514  102.3456802  9.793620323  1164.119187
7  534   109.06  109.14  106.12  89.36575914  0.968744297  100          43007.67453  54008.70819  29971.03064  106.3353716  5.602000237  1164.119187

What does "prescale" mean here, and what are my options?


1 answer
User
#1 · Posted 2024-06-13 22:23:52

I solved this problem with StandardScaler, using the following code as suggested here:

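import pandas as pd
from sklearn import preprocessing

# df is assumed to be the DataFrame of features from the question
# Get column names first
names = df.columns
# Create the scaler object
scaler = preprocessing.StandardScaler()
# Fit the scaler on the data and transform it (returns a NumPy array)
scaled = scaler.fit_transform(df)
# Wrap the result in a DataFrame again, keeping column names and index
scaled_df = pd.DataFrame(scaled, columns=names, index=df.index)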
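
On the "what is prescale" part of the question: as far as I understand the warning, it means bringing the columns to a smaller magnitude before centering, so that subtracting the mean does not lose floating-point precision. Below is a minimal sketch in plain NumPy, assuming that dividing each column by its largest absolute value is an acceptable prescaling step. The helper name prescale_then_standardize is hypothetical, and the array holds only col1, col8 and col9 from the sample rows above.

```python
import numpy as np

# Three columns (col1, col8, col9) copied from the sample rows in the question.
X = np.array([
    [327.0, 588.103818, 633.6584113],
    [1010.0, 1413.802012, 3705.255352],
    [1485.0, 3580.441388, 4327.644518],
    [78.0, 35644.07894, 37765.71684],
    [591.0, 9348.743837, 10699.19772],
    [562.0, 3046.147835, 3710.575743],
    [1030.0, 11348.70932, 11928.21847],
    [534.0, 43007.67453, 54008.70819],
])


def prescale_then_standardize(X):
    """Hypothetical helper: divide by per-column max |x|, then standardize."""
    X = np.asarray(X, dtype=np.float64)
    # Prescale: bring every column into [-1, 1] so centering works on small numbers.
    max_abs = np.abs(X).max(axis=0)
    max_abs[max_abs == 0] = 1.0  # leave all-zero columns unchanged
    X_pre = X / max_abs
    # Standardize: subtract the column mean, divide by the population std (ddof=0).
    mean = X_pre.mean(axis=0)
    std = X_pre.std(axis=0)
    std[std == 0] = 1.0  # constant columns become 0 instead of dividing by zero
    return (X_pre - mean) / std


Z = prescale_then_standardize(X)
print(Z.mean(axis=0))  # close to 0 for every column
print(Z.std(axis=0))   # close to 1 for every column
```

Standardization does not depend on the scale of the input, so the result should match what you get without the prescaling step; the only difference is numerical stability. If you prefer to stay in scikit-learn, chaining preprocessing.MaxAbsScaler and preprocessing.StandardScaler should do roughly the same thing, though I have not checked whether that silences the warning on your full dataset.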
