<p>I am trying out different transformations to test a linear regression model on this dataset:</p>
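<pre><code>import pandas as pd
import numpy as np
import seaborn as sns
import matplotlib.pyplot as plt  # needed for the plt.* calls below
from scipy import stats          # needed for stats.linregress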
data = {'Year': [1830, 1905, 1930, 1947, 1952, 1969],
        'Speed mph': [30, 130, 400, 760, 1500, 25000],
        'Means of attaining speed': ['Railroad', 'Railroad', 'Airplane',
                                     'Airplane', 'Airplane', 'Spaceship']
        }
df = pd.DataFrame(data, columns=['Year', 'Speed mph', 'Means of attaining speed'])
x = df['Year'].values
y = df['Speed mph'].values
# np.power(2, s) evaluates 2**s elementwise (s**2 would be np.power(s, 2))
df['U2'] = np.power(2,df['Speed mph'])
u = df['U2'].values
#regression part
slope, intercept, r_value, p_value, std_err = stats.linregress(x,u)
line = slope*x+intercept
plt.plot(x, line, 'r', label='r_value={:.2f} p_value {:.2f}'.format(r_value,p_value))
#end
plt.scatter(x,u, color="k")
plt.title('${Y^2}$ vs X',fontsize=24)
plt.xlabel('Year,X',fontsize=14)
plt.ylabel('${Y^2}$',fontsize=14)
plt.tick_params(axis='both',labelsize=14)
plt.legend(fontsize=9)
plt.show()
</code></pre>
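<p>This returns an R-squared value (the <code>r_value</code> shown in the legend) of -0.90 and a p-value of 0.01. The p-value is significant, but why is the R-squared negative, at -0.90? I hope someone can explain this to me.
Thanks, everyone.</p>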
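<p>For reference, a minimal sketch of the two quantities involved, reusing the data from the code above. It relies on two points from the NumPy and SciPy documentation: <code>np.power(2, y)</code> raises 2 to the power of each element, and the <code>rvalue</code> returned by <code>stats.linregress</code> is Pearson's correlation coefficient, which can be negative; R-squared is its square. The explicit <code>int64</code> dtype and the names <code>two_to_y</code>, <code>y_squared</code>, <code>fit_pow</code> and <code>fit_sq</code> are assumptions made for illustration. Integer overflow in <code>2**Y</code> is offered only as a possible explanation for the negative value, not a confirmed diagnosis.</p>

```python
import numpy as np
from scipy import stats

# Data copied from the question; the int64 dtype is an assumption about the setup.
x = np.array([1830, 1905, 1930, 1947, 1952, 1969], dtype=np.int64)
y = np.array([30, 130, 400, 760, 1500, 25000], dtype=np.int64)

# np.power(2, y) computes 2**Y elementwise; in int64 this silently wraps
# around for large exponents. np.power(y, 2) computes Y**2 elementwise.
two_to_y = np.power(2, y)
y_squared = np.power(y, 2)

fit_pow = stats.linregress(x, two_to_y)
fit_sq = stats.linregress(x, y_squared)

# rvalue is Pearson's r in [-1, 1] and has the same sign as the slope;
# the coefficient of determination (R-squared) is rvalue squared.
for name, fit in [("2**Y", fit_pow), ("Y**2", fit_sq)]:
    print(f"{name}: slope={fit.slope:.3g}  r={fit.rvalue:.2f}  "
          f"R^2={fit.rvalue ** 2:.2f}  p={fit.pvalue:.3f}")
```

<p>If the two fits print <code>r</code> values of opposite sign, the negative number in the legend would come from the transformation and the sign of the fitted slope, while <code>rvalue ** 2</code> stays between 0 and 1 in both cases.</p>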