<h2>Getting the raw data</h2>
<p>You can first fetch the raw data with <code>requests</code>, then split it into rows and build a DataFrame:</p>
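<pre><code>import pandas as pd
import requests

response = requests.get('http://www.jmulti.de/download/datasets/e6.dat')
# skip the 11 header lines, then split each remaining line on whitespace
data = response.text.split('\n')[11:]
data = [row.split() for row in data]
# drop blank lines (e.g. the empty one after the trailing newline)
data = [row for row in data if row]
df = pd.DataFrame(data, columns=['Dp', 'R']).astype(float)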
</code></pre>
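<p>As an alternative to splitting by hand, the same parsing step can probably be done with <code>pd.read_csv</code>. The following is only a minimal sketch: <code>parse_table</code> and <code>sample_text</code> are hypothetical names, and the inline text is a made-up stand-in for the downloaded file (its header is shorter than the real 11 lines), so that the example runs without network access.</p>

```python
import io

import pandas as pd

# Hypothetical stand-in for response.text: a short header block followed by
# whitespace-separated numeric rows. The real file's header differs.
sample_text = """/* header line 1
header line 2 */
Dp R
-0.003133 0.083
0.018871 0.083
0.024804 0.087
"""


def parse_table(text, header_lines, names):
    """Parse whitespace-separated numeric rows that follow a fixed-length header."""
    return pd.read_csv(
        io.StringIO(text),
        sep=r"\s+",             # any run of whitespace separates columns
        skiprows=header_lines,  # number of leading lines to ignore
        header=None,
        names=names,
        dtype=float,
    )


df_sample = parse_table(sample_text, header_lines=3, names=["Dp", "R"])
print(df_sample)
```

<p>For the real file you would pass <code>response.text</code> and <code>header_lines=11</code>, assuming the header length used above is correct.</p>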
<h2>Adding the quarter and year</h2>
<p>Then we can add the quarter and year to each row like this:</p>
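<pre><code># quarter-end dates, one per data row; pandas 2.2+ deprecates the 'M'
# alias in favour of 'ME', so use freq='3ME' there
datelist = pd.date_range(start='1972-06-30', end='1999-01-01', freq='3M')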
df['quarter'] = datelist.quarter
df['year'] = datelist.year
</code></pre>
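<p>If you would rather not reason about month-end dates at all, a quarterly <code>PeriodIndex</code> is one possible alternative. This is a sketch rather than the method used above; the names <code>periods</code> and <code>labels</code> are illustrative, and the bounds 1972Q2 and 1998Q4 are taken from the sample output in this answer.</p>

```python
import pandas as pd

# One period per quarter, from 1972Q2 through 1998Q4 inclusive
periods = pd.period_range(start="1972Q2", end="1998Q4", freq="Q")

# Quarter and year labels, in the same row order as the data
labels = pd.DataFrame({"quarter": periods.quarter, "year": periods.year})
print(len(labels))
print(labels.head())
```

<p>With a DataFrame of matching length you could then assign <code>df['quarter'] = periods.quarter</code> and <code>df['year'] = periods.year</code> directly.</p>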
<p>Note that the start/end dates in the code above are currently hard-coded, but you could extract them from the raw data with something like this:</p>
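<pre><code># extract the period the data covers from the 2nd line of the file
# (use the raw response text here; `data` is already a list of parsed rows)
period = response.text.split('\n')[1]
# get the dates for the start/end quarters for the period the data covers
# note: to_timestamp() yields the first day of each quarter, not the
# quarter-end dates hard-coded above
start, end = pd.PeriodIndex([period[8:14].replace('Q','-Q'), period[18:24].replace('Q','-Q')], freq='Q').to_timestamp()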
</code></pre>
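<p>The fixed character offsets above break if the header line is laid out even slightly differently. A somewhat more tolerant sketch uses a regular expression to find the two quarter tokens. Here <code>quarters_from_header</code> is a hypothetical helper, and the example header line is an assumption about what the file's second line looks like, not a quote from it.</p>

```python
import re

import pandas as pd


def quarters_from_header(line):
    """Return a quarterly PeriodIndex spanning the two 'YYYYQn' tokens in `line`."""
    tokens = re.findall(r"\d{4}Q[1-4]", line)
    if len(tokens) != 2:
        raise ValueError(f"expected two quarter tokens, found {tokens!r}")
    start, end = tokens
    return pd.period_range(start=start, end=end, freq="Q")


# Hypothetical header line; the real file's wording may differ
periods = quarters_from_header("sample: 1972Q2 -- 1998Q4")
print(periods[0], periods[-1], len(periods))
```

<p>Because the result is already one period per quarter, <code>periods.quarter</code> and <code>periods.year</code> can be used for the new columns without converting to timestamps first.</p>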
<h2>Sample output</h2>
<p>Here is a sample of the output:</p>
<pre><code>           Dp      R  quarter  year
0   -0.003133  0.083        2  1972
1    0.018871  0.083        3  1972
2    0.024804  0.087        4  1972
3    0.016278  0.087        1  1973
4    0.000290  0.102        2  1973
..        ...    ...      ...   ...
102  0.024245  0.051        4  1997
103 -0.014647  0.047        1  1998
104 -0.002049  0.047        2  1998
105  0.002475  0.041        3  1998
106  0.023923  0.038        4  1998

[107 rows x 4 columns]
</code></pre>