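<p>A new column named <code>t</code> is created to hold the parsed times, so that the <code>timedelta</code> between the first occurrence in each group and each row's value can be computed.<br/>
<code>cond</code> is the condition that decides, per group, which rows get one day added. If a group can span more than one day, just modify <code>cond</code>.</p>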
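<pre><code>import pandas as pd

# df is assumed to already hold the Date, Train, Station, Time and ID columns
df['t'] = pd.to_datetime(df.Time, format='%H:%M')
df.Date = pd.to_datetime(df.Date, dayfirst=True)
# First time of each (Train, ID) group, broadcast to every row of that group
first = df.groupby(['Train', 'ID']).t.transform('first')
# True where a stop's time is earlier than the group's first time (past midnight)
cond = first - df.t > pd.Timedelta('0 days')
df.Date = df.Date.mask(cond, df.Date + pd.Timedelta(days=1))
# The positional axis argument of drop() was removed in pandas 2.0
df = df.drop(columns='t')
df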
</code></pre>
<p><strong>Output</strong></p>
<pre><code>          Date      Train     Station   Time  ID
0   2020-10-02  Flixtrain      London  10:40   1
1   2020-10-02  Flixtrain      Berlin  20:30   1
2   2020-10-02  Flixtrain     Hamburg  23:45   1
3   2020-10-02       VSOE  Amesterdam  21:30   2
4   2020-10-03       VSOE     Cologne  00:50   2
5   2020-10-03       VSOE      Berlin  04:30   2
6   2020-10-02    ICE-220    Warschau  12:35   3
7   2020-10-02    ICE-220     Breslau  17:40   3
8   2020-10-02    ICE-220        Prag  23:13   3
9   2020-10-02    ICE-342        Wien  00:35   4
10  2020-10-02    ICE-342    Salzburg  07:42   4
11  2020-10-02    ICE-342      Munich  13:13   4
</code></pre>
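<p>If a group can cross midnight more than once, one possible way to modify the condition is to count the crossings instead of comparing against the first time only. The following is a minimal sketch, not part of the original answer. It assumes that rows within each group are already in travel order and that consecutive stops are less than 24 hours apart. The sample frame and the names <code>crossed</code> and <code>days</code> are made up for illustration.</p>

```python
import pandas as pd

# Hypothetical sample: one train whose schedule crosses midnight twice.
df = pd.DataFrame({
    "Date": ["02/10/2020"] * 5,
    "Train": ["VSOE"] * 5,
    "Time": ["21:30", "00:50", "23:10", "04:30", "06:00"],
    "ID": [2] * 5,
})

df["Date"] = pd.to_datetime(df["Date"], dayfirst=True)
t = pd.to_datetime(df["Time"], format="%H:%M")
keys = [df["Train"], df["ID"]]

# Within each group, a time earlier than the previous stop's time
# marks a midnight crossing (the first row of a group compares as False).
crossed = t.groupby(keys).diff() < pd.Timedelta(0)

# Running count of crossings per group = number of days to add to each row.
days = crossed.astype(int).groupby(keys).cumsum()

df["Date"] = df["Date"] + pd.to_timedelta(days, unit="D")
print(df)
```

<p>With this sample, the first row keeps 2020-10-02, the next two rows move to 2020-10-03, and the last two rows move to 2020-10-04. For groups that cross midnight at most once, the rows that get shifted are the same ones the <code>cond</code> approach above would shift, provided no later stop's time falls before the first stop's time on the same day.</p>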