<p>Use <a href="http://pandas.pydata.org/pandas-docs/stable/generated/pandas.Series.isin.html" rel="nofollow noreferrer"><code>isin</code></a> to find the matching values, replace them with missing values using <a href="http://pandas.pydata.org/pandas-docs/stable/generated/pandas.Series.mask.html" rel="nofollow noreferrer"><code>mask</code></a>, and then forward fill them with the previous values:</p>
<pre><code>g = df1['indexID'].mask(df1['indexID'].isin(df2['matchID'])).ffill().astype(int)
print(g)
0      0
1      0
2      0
3      3
4      3
5      3
6      6
7      7
8      7
9      7
10    10
Name: indexID, dtype: int32
</code></pre>
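<p>To make the chained call easier to follow, here is a minimal sketch that splits it into its three steps. The sample values are assumptions chosen to reproduce the output above, since the original <code>df1</code> and <code>df2</code> are not shown here:</p>
<pre><code>import pandas as pd

# Hypothetical sample data, not the original frames
index_id = pd.Series([0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10], name='indexID')
match_id = pd.Series([1, 2, 4, 5, 8, 9], name='matchID')

# Step 1: True where an indexID also appears in matchID
is_match = index_id.isin(match_id)

# Step 2: replace the matched values with NaN (the dtype becomes float)
masked = index_id.mask(is_match)

# Step 3: each NaN takes the last unmatched value, then cast back to int
g = masked.ffill().astype(int)

steps = pd.DataFrame({'indexID': index_id, 'is_match': is_match,
                      'masked': masked, 'g': g})
print(steps)
</code></pre>
<p>Note that if the very first <code>indexID</code> were itself a match, there would be no earlier value to fill from, the <code>NaN</code> would remain, and <code>astype(int)</code> would raise an error.</p>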
<p>Then use <code>groupby</code> and aggregate with <code>join</code>:</p>
<pre><code># to group only by the new Series g
df = df1.groupby(g).agg({'details': ' '.join, 'id': 'first'}).reset_index()
print(df)
   indexID                                            details   id
0        0  'series of numbers' 'series of numbers' 'serie...   78
1        3  'series of numbers' 'series of numbers' 'serie...  120
2        6                                'series of numbers'  110
3        7  'series of numbers' 'series of numbers' 'serie...  109
4       10                                'series of numbers'   79
</code></pre>
<hr/>
<pre><code># or group by the id column as well
df = df1.groupby(['id', g], sort=False)['details'].agg(' '.join).reset_index()
print(df)
    id  indexID                                            details
0   78        0  'series of numbers' 'series of numbers' 'serie...
1  120        3  'series of numbers' 'series of numbers' 'serie...
2  110        6                                'series of numbers'
3  109        7  'series of numbers' 'series of numbers' 'serie...
4   79       10                                'series of numbers'
</code></pre>
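<p>For a version you can run end to end, the sketch below builds small stand-in frames. The column names follow the answer, but the values (including the single-letter <code>details</code> strings) are assumptions made up for illustration:</p>
<pre><code>import pandas as pd

# Hypothetical stand-ins for df1 and df2
df1 = pd.DataFrame({
    'indexID': range(11),
    'id': [78, 78, 78, 120, 120, 120, 110, 109, 109, 109, 79],
    'details': ['a', 'b', 'c', 'd', 'e', 'f', 'g', 'h', 'i', 'j', 'k'],
})
df2 = pd.DataFrame({'matchID': [1, 2, 4, 5, 8, 9]})

# Grouping key: matched rows inherit the last unmatched indexID
g = df1['indexID'].mask(df1['indexID'].isin(df2['matchID'])).ffill().astype(int)

# Join the details of each group and keep the first id
out = df1.groupby(g).agg({'details': ' '.join, 'id': 'first'}).reset_index()
print(out)
</code></pre>
<p>With this sample data, each run of matched rows collapses into the unmatched row that precedes it, so the eleven input rows become five.</p>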