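<p>As an alternative, you could leave the <code>Time</code> series as it is and only add rows at the missing positions you described, using a time difference of more than 12 minutes between consecutive rows as the condition (the code uses a threshold of 12:59, which leaves some slack around the nominal 12 minutes). The trade-off is that, depending on the existing timestamps, you will not get exact 12-minute time slices between a generated row and the next existing value.</p>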
<pre><code>import pandas as pd
df = pd.DataFrame([
["2014/04/02 00:00:09",0.5],
["2014/04/02 00:12:20",1.1],
["2014/04/02 00:24:05",0.48],
["2014/04/02 00:36:51",2.3],
["2014/04/02 01:00:08",4.1],
["2014/04/02 01:12:26",5.0],
["2014/04/02 01:24:02",3.2],
["2014/04/02 02:44:02",1.2], # added for test
["2014/04/02 03:54:02",7.72] # added for test
])
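df[0] = pd.to_datetime(df[0])
print(df)

# time difference between each row and the row before it
delta = df[0].diff()
# keep only the gaps larger than the 12:59 threshold
diff_idx = delta.where(delta > pd.Timedelta("00:12:59"))
print(delta)
# index labels of the rows that come right after a gap
idx = df[diff_idx.notnull()].index
td = pd.Timedelta("00:12:00")  # spacing of the inserted rows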
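new_rows = []
for k in idx:
    deltaT = df.loc[k, 0] - df.loc[k - 1, 0]
    num_missrows = deltaT // td
    # don't overlap last time value: only an inexact multiple leaves room for one more slot
    if deltaT % td != pd.Timedelta(0):
        num_missrows += 1
    new_avg = df.loc[k - 1, 1]  # previous existing Rain value
    for i in range(1, num_missrows):
        # each inserted value is the mean of the next existing value and the previous value
        new_avg = (df.loc[k, 1] + new_avg) / 2
        new_rows.append([df.loc[k - 1, 0] + i * td, new_avg])
# DataFrame.append was removed in pandas 2.0, so collect the rows and concatenate once
df = pd.concat([df, pd.DataFrame(new_rows)], ignore_index=True)
df = df.sort_values(by=0).reset_index(drop=True)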
print(df)
</code></pre>
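<p>The per-gap logic of the loop above can also be written as a small standalone function. This is only a sketch that uses the standard library's <code>datetime</code> instead of pandas; the name <code>fill_gap</code> and its parameters are made up for illustration and are not part of the original code.</p>
<pre><code>from datetime import datetime, timedelta


def fill_gap(t_prev, v_prev, t_next, v_next, step=timedelta(minutes=12)):
    """Return the (time, value) rows to insert between two existing samples.

    Hypothetical helper: mirrors the loop body of the answer for one gap.
    """
    gap = t_next - t_prev
    n_slots = gap // step
    # An inexact multiple leaves room for one more row before t_next.
    if gap % step != timedelta(0):
        n_slots += 1
    rows = []
    value = v_prev
    for i in range(1, n_slots):
        # Move halfway from the previous value towards the next existing value.
        value = (v_next + value) / 2
        rows.append((t_prev + i * step, value))
    return rows


# The 80-minute gap from the sample data (01:24:02 to 02:44:02).
rows = fill_gap(datetime(2014, 4, 2, 1, 24, 2), 3.2,
                datetime(2014, 4, 2, 2, 44, 2), 1.2)
for t, v in rows:
    print(t, v)
</code></pre>
<p>For that gap the sketch yields six rows, the last one at 02:36:02 with a value of about 1.23125. Because every step halves the remaining distance to the next existing value, the inserted values approach that value quickly rather than linearly.</p>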
<p>Output from <strong>df</strong>:</p>
<pre><code>                      0        1
0   2014-04-02 00:00:09  0.50000
1   2014-04-02 00:12:20  1.10000
2   2014-04-02 00:24:05  0.48000
3   2014-04-02 00:36:51  2.30000
4   2014-04-02 00:48:51  3.20000  # added row
5   2014-04-02 01:00:08  4.10000
6   2014-04-02 01:12:26  5.00000
7   2014-04-02 01:24:02  3.20000
8   2014-04-02 01:36:02  2.20000  # added row
9   2014-04-02 01:48:02  1.70000  # added row
10  2014-04-02 02:00:02  1.45000  # added row
11  2014-04-02 02:12:02  1.32500  # added row
12  2014-04-02 02:24:02  1.26250  # added row
13  2014-04-02 02:36:02  1.23125  # added row, not 12 min. (~8 min. diff.)
14  2014-04-02 02:44:02  1.20000
15  2014-04-02 02:56:02  4.46000  # added row
16  2014-04-02 03:08:02  6.09000  # added row
17  2014-04-02 03:20:02  6.90500  # added row
18  2014-04-02 03:32:02  7.31250  # added row
19  2014-04-02 03:44:02  7.51625  # added row, not 12 min. (~10 min. diff.)
20  2014-04-02 03:54:02  7.72000
</code></pre>
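<p>The shortened slices flagged in the output follow from the remainder of each gap after dividing by the 12-minute step. A minimal sketch of that arithmetic, using only the standard library (the helper name <code>last_slice</code> is hypothetical):</p>
<pre><code>from datetime import timedelta


def last_slice(gap, step=timedelta(minutes=12)):
    """Interval between the last inserted row and the next existing row.

    Hypothetical helper; assumes the gap is long enough to get inserted rows.
    """
    remainder = gap % step
    # An exact multiple of the step leaves a full-length final slice.
    return remainder if remainder != timedelta(0) else step


print(last_slice(timedelta(minutes=80)))  # gap 01:24:02 to 02:44:02
print(last_slice(timedelta(minutes=70)))  # gap 02:44:02 to 03:54:02
</code></pre>
<p>For the two test gaps this gives 8 and 10 minutes, matching the rows marked as not being 12 minutes apart.</p>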