擅长:python、mysql、java
<p>两个问题。您不应该对多个列重复使用相同的<code>LabelEncoder</code>。否则,您将丢失映射,无法转换测试数据</p>
<pre><code>category_le = preprocessing.LabelEncoder()
day_of_week_le = preprocessing.LabelEncoder()
pd_district_le = preprocessing.LabelEncoder()
train_category = category_le.fit_transform(train.Category)
train_day_of_week = day_of_week_le.fit_transform(train.DayOfWeek)
train_pd_district = pd_district_le.fit_transform(train.PdDistrict)
train_X = np.hstack([train_category_mat, train_day_of_week_mat, pd_district_le])
test_category = category_le.transform(test.Category)
test_day_of_week = day_of_week_le.transform(test.DayOfWeek)
test_pd_district = pd_district_le.transform(test.PdDistrict)
</code></pre>