Why does linear regression on degree-1 polynomial features give different results?

    from sklearn.linear_model import LinearRegression
    from sklearn.metrics import r2_score

    linear_reg = LinearRegression()
    linear_reg.fit(X_train_scaled, y_train)

    y_pred_train = linear_reg.predict(X_train_scaled)
    y_pred_val = linear_reg.predict(X_val_scaled)

    r2_train = r2_score(y_train, y_pred_train)
    r2_val = r2_score(y_val, y_pred_val)
    print('r2_train', r2_train)
    print('r2_val', r2_val)

    from sklearn.preprocessing import PolynomialFeatures

    pf = PolynomialFeatures(1)
    X_train_poly = pf.fit_transform(X_train_scaled)[:, 1:]  # ignore first col
    X_val_poly = pf.transform(X_val_scaled)[:, 1:]  # ignore first col

    linear_reg = LinearRegression()
    linear_reg.fit(X_train_poly, y_train)

    y_pred_train = linear_reg.predict(X_train_poly)
    y_pred_val = linear_reg.predict(X_val_poly)

    r2_train = r2_score(y_train, y_pred_train)
    r2_val = r2_score(y_val, y_pred_val)
    print('r2_train', r2_train)
    print('r2_val', r2_val)

1 answer

User

#1 · Posted 2024-06-28 14:56:07

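scikit-learn's LinearRegression fits the training data to a linear model by ordinary least squares. PolynomialFeatures does not fit anything itself: it is a preprocessing transformer that only generates feature columns, and with degree 1 those are a constant bias column plus the original features. One detail from the documentation of its transform() method: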

Prefer CSR over CSC for sparse input (for speed), but CSC is required if the degree is 4 or higher. If the degree is less than 4 and the input format is CSC, it will be converted to CSR, have its polynomial features generated, then converted back to CSC. (see: https://scikit-learn.org/stable/modules/generated/sklearn.preprocessing.PolynomialFeatures.html)

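Since both versions fit the same ordinary least squares model on what should be the same features, you should get the same results up to slight differences (like yours). Those differences are floating-point rounding/approximation error. One candidate is the compressed sparse row (CSR) conversion path, although it applies only to sparse input and stores values at the same precision; with dense arrays, rounding inside the least-squares solver is the more likely source.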
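One way to check this reasoning is to compare the two pipelines on synthetic data. The sketch below is illustrative only: the array `X`, the target `y`, and the coefficients are made-up stand-ins for `X_train_scaled` and `y_train`, not the asker's data. It uses `include_bias=False` instead of slicing off the first column, and also runs the sparse CSR path for comparison.

```python
import numpy as np
from scipy import sparse
from sklearn.linear_model import LinearRegression
from sklearn.preprocessing import PolynomialFeatures

# Synthetic stand-in for X_train_scaled / y_train (assumed, not the asker's data).
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 3))
y = X @ np.array([1.5, -2.0, 0.5]) + rng.normal(scale=0.1, size=200)

# Degree-1 expansion without the bias column: only the original features remain.
pf = PolynomialFeatures(degree=1, include_bias=False)
X_poly = pf.fit_transform(X)

# Same expansion through the sparse (CSR) code path, converted back to dense.
X_poly_csr = pf.fit_transform(sparse.csr_matrix(X)).toarray()

# Largest element-wise deviation from the input for each path.
max_diff_dense = np.abs(X_poly - X).max()
max_diff_csr = np.abs(X_poly_csr - X).max()

# Fit ordinary least squares on both feature sets and compare R^2.
r2_plain = LinearRegression().fit(X, y).score(X, y)
r2_poly = LinearRegression().fit(X_poly, y).score(X_poly, y)
r2_gap = abs(r2_plain - r2_poly)

print("max feature difference (dense):", max_diff_dense)
print("max feature difference (CSR):", max_diff_csr)
print("R^2 gap between the two fits:", r2_gap)
```

If the R^2 gap on your own data is far larger than rounding noise, the two runs are probably not seeing the same inputs (for example a different split, a re-fitted scaler, or a sliced column that is not the bias column), which is worth checking before blaming floating-point error.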
