<p>I noticed one very simple mistake:</p>
<pre><code>X_train=train.drop(['Station','StationIndex','dayofyear'],axis=1)
Y_train=train['Rainfall']
X_test=test.drop(['Station','StationIndex','dayofyear'],axis=1)
Y_test=test['Rainfall']
</code></pre>
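<p>You have not removed the <code>Rainfall</code> column from the training features (nor from the test features).</p>
<p>I'll go out on a limb and guess that you are getting 100% accuracy on both the training and the test set, right? This is why. Your model can see that whatever appears in the <code>Rainfall</code> column is the answer, so it does exactly the same thing at test time and gets a perfect score, when in fact it is not predicting anything at all.</p>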
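<p>The effect is easy to reproduce on synthetic data. The sketch below does not use the asker's dataset: the columns <code>f0</code> to <code>f2</code>, the <code>label</code> column and the helper <code>make_frame</code> are all made up for illustration. The features are pure noise, so an honest model should score close to chance, while a model that is also handed the label column should score close to 100%.</p>

```python
import numpy as np
import pandas as pd
from sklearn.svm import SVC

rng = np.random.default_rng(0)

def make_frame(n):
    """Hypothetical data: three noise features plus a random binary label."""
    frame = pd.DataFrame(rng.normal(size=(n, 3)), columns=["f0", "f1", "f2"])
    frame["label"] = rng.integers(0, 2, size=n)
    return frame

train, test = make_frame(400), make_frame(400)

# Leaky setup: the target column is still among the features.
leaky = SVC(kernel="linear").fit(train, train["label"])
leaky_acc = leaky.score(test, test["label"])

# Honest setup: the target column is dropped from the features.
honest = SVC(kernel="linear").fit(train.drop(columns="label"), train["label"])
honest_acc = honest.score(test.drop(columns="label"), test["label"])

print(f"with label leaked:  {leaky_acc:.2%}")
print(f"with label dropped: {honest_acc:.2%}")
```

<p>A near-perfect score on both splits is therefore a warning sign worth checking, not evidence of a good model.</p>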
<p>Try running it like this:</p>
<pre><code>from sklearn import svm

# Drop the identifier columns and the target itself from the features
X_train=train.drop(['Station','StationIndex','dayofyear','Rainfall'],axis=1)
Y_train=train['Rainfall']
X_test=test.drop(['Station','StationIndex','dayofyear','Rainfall'],axis=1)
Y_test=test['Rainfall']

# gamma is ignored by the linear kernel, so it is left out here
model = svm.SVC(kernel='linear')
model.fit(X_train, Y_train)
print('Accuracy on training set: {:.2f}%'.format(100*model.score(X_train, Y_train)))
print('Accuracy on testing set: {:.2f}%'.format(100*model.score(X_test, Y_test)))
</code></pre>
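<p>To make this kind of slip harder to repeat, one option is to build the features and the target in a single place. This is only a sketch: <code>split_features_target</code> is a hypothetical helper, not part of pandas or scikit-learn, and the <code>Humidity</code> column and all values in the tiny frame are invented for the example.</p>

```python
import pandas as pd

def split_features_target(frame, target, drop=()):
    """Return (X, y); X never contains the target or the columns in `drop`."""
    y = frame[target]
    X = frame.drop(columns=[*drop, target])
    return X, y

# Tiny made-up frame reusing the column names from the question.
demo = pd.DataFrame({
    "Station": ["A", "B"],
    "StationIndex": [1, 2],
    "dayofyear": [10, 11],
    "Humidity": [0.4, 0.9],   # hypothetical feature column
    "Rainfall": [0, 1],
})

X, y = split_features_target(
    demo, "Rainfall", drop=["Station", "StationIndex", "dayofyear"]
)
print(list(X.columns))
```

<p>Calling the same helper for both <code>train</code> and <code>test</code> keeps the two feature sets consistent.</p>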