I am loading data from a txt file with pandas. Can anyone tell me what is wrong with my code?
import numpy as np
import pandas as pd
import sklearn.linear_model

wineQuality = pd.read_csv('winequality-all.txt', sep=",")
# Select the feature columns with a list of labels (a tuple can be read as a single MultiIndex key)
X = wineQuality.loc[:, ["fixed.acidity", "volatile.acidity", "citric.acid", "residual.sugar", "chlorides", "free.sulfur.dioxide", "total.sulfur.dioxide", "density", "pH", "sulphates", "alcohol", "color"]]
y = wineQuality.loc[:, 'response']
X = X.drop(['color'], axis=1)  # drop the non-numeric column
X = X.to_numpy()
y = y.to_numpy()
print(X)
print(y)
print(X.shape)
print(y.shape)
np.matmul(X, y)  # this is the call that raises the ValueError
mnk = sklearn.linear_model.LinearRegression().fit(X, y)
print('Score :', mnk.score(X, y))
print('Avg values :', mnk.predict(X.mean().reshape(1, -1)))
My winequality-all.txt file looks like this:
"fixed.acidity","volatile.acidity","citric.acid","residual.sugar","chlorides","free.sulfur.dioxide","total.sulfur.dioxide","density","pH","sulphates","alcohol","response","color"
7.4,0.7,0,1.9,0.076,11,34,0.9978,3.51,0.56,9.4,3,"red"
7.8,0.88,0,2.6,0.098,25,67,0.9968,3.2,0.68,9.8,3,"red"
7.8,0.76,0.04,2.3,0.092,15,54,0.997,3.26,0.65,9.8,3,"red"
...
I tried calling things like reshape(-1, 1) or reshape(1, -1) on my X and y, but that did not work for me.
Output (the error I get):
ValueError: matmul: Input operand 1 has a mismatch in its core dimension 0, with gufunc signature (n?,k),(k,m?)->(n?,m?) (size 5320 is different from 11)
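The message can be read as a shape mismatch: `np.matmul` requires the last dimension of the first operand (11 features) to equal the first dimension of the second (5320 samples). Below is a minimal sketch that reproduces this with small stand-in shapes; the sizes 6 and 3 are hypothetical, and using `X.T` assumes the intended product was the transpose of X times y, which the question does not state.

```python
import numpy as np

# Hypothetical small shapes standing in for X (5320, 11) and y (5320,)
n_samples, n_features = 6, 3
X = np.arange(n_samples * n_features, dtype=float).reshape(n_samples, n_features)
y = np.ones(n_samples)

# (6, 3) @ (6,) fails: inner dimensions 3 and 6 do not match
try:
    np.matmul(X, y)
    failed = False
except ValueError:
    failed = True

# Assumption: the intended product was X^T y, i.e. (3, 6) @ (6,) -> (3,)
xty = np.matmul(X.T, y)

print(failed, xty.shape)
```

Reshaping y to (-1, 1) or (1, -1) does not help here, because neither shape makes the inner dimensions agree with X's 11 columns.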
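If you have a NumPy array, you need to pass axis=0 to mean() to get one mean per column; otherwise it returns the overall mean of the entire array. Alternatively, keep the data as a DataFrame, whose mean() is column-wise by default.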
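A minimal sketch of both options, using only the first three rows and first three columns from the sample file in the question (this cut-down frame is an illustration, not the full dataset):

```python
import numpy as np
import pandas as pd

# First three rows and columns of the sample data shown in the question
df = pd.DataFrame({
    "fixed.acidity": [7.4, 7.8, 7.8],
    "volatile.acidity": [0.7, 0.88, 0.76],
    "citric.acid": [0.0, 0.0, 0.04],
})
X = df.to_numpy()

# Without axis, NumPy averages every element into a single scalar
grand_mean = X.mean()

# With axis=0, NumPy returns one mean per column, shape (3,)
col_means = X.mean(axis=0)

# predict() expects a 2-D input with one row: shape (1, n_features)
row_for_predict = col_means.reshape(1, -1)

# Keeping the DataFrame: DataFrame.mean() is already column-wise
df_means = df.mean().to_numpy().reshape(1, -1)

print(np.shape(grand_mean), col_means.shape, row_for_predict.shape)
```

In the question's code this would correspond to calling `mnk.predict(X.mean(axis=0).reshape(1, -1))`, so the model receives one row with 11 feature means instead of a single number.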