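This is the code I wrote. I am trying to convert non-numeric data to numeric values, but it raises an error: ValueError: cannot copy sequence with size 205 to array axis with dimension 26. The data comes from http://archive.ics.uci.edu/ml/datasets/Automobile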
import pandas as pd
from sklearn import preprocessing

# The CSV has no header row, so the column names are supplied explicitly.
automobile = pd.read_csv('imports-85.csv', names=[
    "symboling", "normalized-losses", "make", "fuel", "aspiration",
    "num-of-doors", "body-style", "drive-wheels", "engine-location",
    "wheel-base", "length", "width", "height", "curb-weight",
    "engine-type", "num-of-cylinders", "engine-size", "fuel-system",
    "bore", "stroke", "compression-ratio", "horsepower", "peak-rpm",
    "city-mpg", "highway-mpg", "price"])
X = automobile.drop('symboling', axis=1)
y = automobile['symboling']
le = preprocessing.LabelEncoder()
le.fit([automobile])  # this call raises the ValueError
print(le)
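The fit method accepts an array of shape [n_samples]; see the docs. You are passing the entire DataFrame wrapped in a list. I'm fairly sure that if you print the shape of the DataFrame (automobile.shape), it will show (205, 26). If you want to encode your data, you need to do it one column at a time, for example le.fit(automobile['make']).

Note that this is not the right way to encode categorical data. As the name suggests, LabelEncoder is designed for labels, not for input features. In the current state of scikit-learn, you should use OneHotEncoder. There are plans for a categorical encoder in the next release.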
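The two suggestions above can be sketched roughly as follows. This is a minimal illustration, not the asker's actual pipeline: the small DataFrame is a made-up stand-in for a few columns of imports-85.csv, and passing string columns directly to OneHotEncoder assumes scikit-learn 0.20 or later (older versions accepted only integer input).

```python
import pandas as pd
from sklearn.preprocessing import LabelEncoder, OneHotEncoder

# Hypothetical stand-in for a few columns of imports-85.csv.
automobile = pd.DataFrame({
    "symboling": [3, 1, 2, 0],
    "make": ["audi", "bmw", "audi", "volvo"],
    "fuel": ["gas", "gas", "diesel", "gas"],
})

# LabelEncoder expects a 1-D array of shape [n_samples],
# so pass a single column rather than the whole DataFrame.
le = LabelEncoder()
make_codes = le.fit_transform(automobile["make"])
print(le.classes_)  # ['audi' 'bmw' 'volvo']
print(make_codes)   # [0 1 0 2]

# For input features, one-hot encode the categorical columns instead.
# Assumes scikit-learn >= 0.20, where string categories are accepted.
ohe = OneHotEncoder(handle_unknown="ignore")
encoded = ohe.fit_transform(automobile[["make", "fuel"]]).toarray()
print(encoded.shape)  # (4, 5): 3 makes + 2 fuel types
```

The integer codes from LabelEncoder imply an ordering (audi < bmw < volvo) that has no meaning for a feature such as make, which is why one-hot encoding is usually the safer choice for inputs.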