Error "Expected 2D array, got 1D array instead" when using OneHotEncoder

# Import Libraries
import numpy as np
import matplotlib.pyplot as plt
import pandas as pd

# Import dataset
dataset = pd.read_csv('Data2.csv')
X = dataset.iloc[:, :-1].values
y = dataset.iloc[:, 5].values
df_X = pd.DataFrame(X)
df_y = pd.DataFrame(y)

# Replace Missing Values
from sklearn.preprocessing import Imputer
imputer = Imputer(missing_values = 'NaN', strategy = 'mean', axis = 0)
imputer = imputer.fit(X[:, 3:5])
X[:, 3:5] = imputer.transform(X[:, 3:5])

# Encoding Categorical Data "Name"
from sklearn.preprocessing import LabelEncoder, OneHotEncoder
labelencoder_x = LabelEncoder()
X[:, 0] = labelencoder_x.fit_transform(X[:, 0])

# Transform into a Matrix
onehotencoder1 = OneHotEncoder(categorical_features = [0])
X[:, 0] = onehotencoder1.fit_transform(X[:, 0]).toarray()

# Encoding Categorical Data "University"
from sklearn.preprocessing import LabelEncoder
labelencoder_x1 = LabelEncoder()
X[:, 1] = labelencoder_x1.fit_transform(X[:, 1])

# Encoding Categorical Data "Name"
from sklearn.preprocessing import LabelEncoder, OneHotEncoder
labelencoder_x = LabelEncoder()
X[:, 0] = labelencoder_x.fit_transform(X[:, 0])

# Transform into a Matrix
onehotencoder1 = OneHotEncoder(categorical_features = [0])
X[:, 0] = onehotencoder1.fit_transform(X[:, 0]).toarray()

# Encoding Categorical Data "University"
from sklearn.preprocessing import LabelEncoder, OneHotEncoder
labelencoder_x1 = LabelEncoder()
X[:, 1] = labelencoder_x1.fit_transform(X[:, 1])

# Transform into a Matrix
onehotencoder2 = OneHotEncoder(categorical_features = [1])
X[:, 1] = onehotencoder1.fit_transform(X[:, 1]).toarray()

  File "/Users/jim/anaconda3/lib/python3.6/site-packages/sklearn/utils/validation.py", line 441, in check_array
    "if it contains a single sample.".format(array))
ValueError: Expected 2D array, got 1D array instead:
array=[ 2. 1. 3. 2. 3. 5. 5. 0. 4. 0.].
Reshape your data either using array.reshape(-1, 1) if your data has a single feature or array.reshape(1, -1) if it contains a single sample.

3 Answers

User

#1 · Edited 2024-09-27 09:37:17

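This is a known sklearn OneHotEncoder issue, raised in https://github.com/scikit-learn/scikit-learn/issues/3662. Most scikit-learn estimators require a 2D array rather than a 1D array.

The standard practice is to pass in a multidimensional array. Since categorical_features = [0] already tells the encoder which column to treat as categorical, you can rewrite the next line as follows and pass the whole dataset, or any 2D part of it. Only the first column is converted from categorical to dummy variables, while the encoder still receives a multidimensional array to work with.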

onehotencoder1 = OneHotEncoder(categorical_features = [0])
X = onehotencoder1.fit_transform(X).toarray()

(I hope your dataset has no other categorical values. I'd advise you to label-encode everything first, and then one-hot encode.)
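A note for readers on newer scikit-learn releases: as far as I know, `categorical_features` was deprecated in 0.20 and removed in 0.22, so the two lines above only run on older versions. A roughly equivalent approach today is `ColumnTransformer`, which likewise receives the full 2D array and encodes only the columns you name. The following is a minimal sketch, not the asker's actual pipeline; the small array stands in for Data2.csv, and its names and values are made up for illustration.

```python
import numpy as np
from sklearn.compose import ColumnTransformer
from sklearn.preprocessing import OneHotEncoder

# Hypothetical stand-in for the question's data: column 0 is categorical,
# the remaining columns are numeric.
X = np.array(
    [["Alice", 3.5, 20.0],
     ["Bob", 3.9, 22.0],
     ["Alice", 3.1, 21.0],
     ["Carol", 3.7, 23.0]],
    dtype=object,
)

# One-hot encode column 0 only and pass the other columns through unchanged.
# The whole 2D array goes into fit_transform, so no 1D slice is involved.
ct = ColumnTransformer(
    transformers=[("name", OneHotEncoder(), [0])],
    remainder="passthrough",
    sparse_threshold=0,  # always return a dense array
)
X_encoded = ct.fit_transform(X)

# Three one-hot columns (Alice, Bob, Carol) followed by the two numeric ones.
print(X_encoded.shape)
```

With this approach the separate `LabelEncoder` step should not be needed, since recent versions of `OneHotEncoder` accept string categories directly.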

User

#2 · Edited 2024-09-27 09:37:17

You need to reshape your data, because the method expects a multidimensional array, as mentioned above. X = X.reshape(-1, 1) worked for me too.
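To make the shape difference concrete, here is a small NumPy-only sketch. It reuses the ten values printed in the error message; the 4x3 matrix is made up for illustration. Note that `reshape(-1, 1)` is meant for a single column: applied to a whole feature matrix it would stack every value into one column.

```python
import numpy as np

# The label-encoded column from the error message, as a 1D slice.
column = np.array([2., 1., 3., 2., 3., 5., 5., 0., 4., 0.])
print(column.shape)       # (10,)  -> rejected by check_array

# reshape(-1, 1) turns it into 10 samples with 1 feature each.
column_2d = column.reshape(-1, 1)
print(column_2d.shape)    # (10, 1)

# Hypothetical feature matrix: indexing with an integer drops a dimension,
# while a list or a slice keeps the result 2D.
X = np.arange(12).reshape(4, 3)
one_d = X[:, 0]           # shape (4,)
two_d_list = X[:, [0]]    # shape (4, 1)
two_d_slice = X[:, 0:1]   # shape (4, 1)
```

Even with a 2D input, the one-hot result has one column per category, so it cannot be assigned back into the single slice `X[:, 0]` as the question's code attempts; it has to be kept as a separate array or concatenated with the remaining columns.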

User

#3 · Edited 2024-09-27 09:37:17

I got the same error, and the error message was followed by this suggestion:

"Reshape your data either using array.reshape(-1, 1) if your data has a single feature or array.reshape(1, -1) if it contains a single sample."

Because my data was an array, I used X.values.reshape(-1,1), and it worked. (Someone else also suggested using X.values.reshape rather than X.reshape.)
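The use of `.values` suggests that `X` here was a pandas object rather than a plain NumPy array; that is an assumption on my part. Under that assumption, the sketch below shows two ways to get a 2D input from a single column. The `University` column and its values are hypothetical.

```python
import pandas as pd

# Hypothetical one-column frame standing in for the answerer's data.
df = pd.DataFrame({"University": ["MIT", "ETH", "MIT"]})

# Selecting with a single label gives a 1D Series.
col = df["University"]
print(col.shape)          # (3,)

# .values exposes the underlying NumPy array, which can be reshaped.
arr = col.values.reshape(-1, 1)
print(arr.shape)          # (3, 1)

# Selecting with a list of labels keeps a 2D DataFrame, no reshape needed.
frame = df[["University"]]
print(frame.shape)        # (3, 1)
```

This is probably why `X.values.reshape` was recommended over `X.reshape`: reshaping is a NumPy array operation, and to my knowledge recent pandas versions do not provide a `reshape` method on a Series.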
