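<p>First, flag the missing rows with <a href="http://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.Series.isna.html" rel="nofollow noreferrer"><code>Series.isna</code></a>. Then find the first row of each group by comparing the mask with its <a href="http://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.Series.shift.html" rel="nofollow noreferrer"><code>Series.shift</code></a>ed copy, keep only those first values, and create a new column that carries them down to the following rows with <code>ffill</code>. Finally, use <a href="https://docs.scipy.org/doc/numpy/reference/generated/numpy.where.html" rel="nofollow noreferrer"><code>numpy.where</code></a> to set the rows that were missing in the original column back to <code>NaN</code>:</p>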
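<pre><code>import numpy as np
import pandas as pd

# test for missing values
m = df['Ctgr'].isna()
# alternative: test for empty strings
# m = df['Ctgr'].eq('')

# keep the first value of each run, forward-fill it, then blank out the missing rows
df['subctgr'] = np.where(m, np.nan, df['Ctgr'].where(m.ne(m.shift())).ffill())
print(df)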
  Ctgr subctgr
0    A       A
1    B       A
2    B       A
3    C       A
4  NaN     NaN
5    D       D
6    E       D
7    F       D
</code></pre>
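<p>The same steps can be wrapped in a small reusable function. This is only a sketch: <code>first_of_run</code> is a hypothetical helper name, the first sample frame is reconstructed from the printed output above, and the empty-string frame is made-up data. It uses <code>Series.mask</code> instead of <code>numpy.where</code> so the result stays a Series.</p>

```python
import numpy as np
import pandas as pd


def first_of_run(s, missing):
    """Hypothetical helper: label each row with the first value of its
    consecutive run; rows flagged in `missing` become NaN."""
    starts = missing.ne(missing.shift())  # True where a new run begins
    filled = s.where(starts).ffill()      # keep run starts, carry them down
    return filled.mask(missing)           # blank out the separator rows


# Sample data reconstructed from the printed output above.
nan_df = pd.DataFrame({'Ctgr': ['A', 'B', 'B', 'C', np.nan, 'D', 'E', 'F']})
nan_df['subctgr'] = first_of_run(nan_df['Ctgr'], nan_df['Ctgr'].isna())

# Same idea with empty strings as separators (made-up data).
str_df = pd.DataFrame({'Ctgr': ['A', 'B', '', 'D', 'E']})
str_df['subctgr'] = first_of_run(str_df['Ctgr'], str_df['Ctgr'].eq(''))

print(nan_df)
print(str_df)
```

<p>Passing the mask as an argument keeps the choice of what counts as a separator (<code>isna()</code> or <code>eq('')</code>) with the caller, mirroring the two variants of <code>m</code> shown above.</p>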
<p>Details, with each intermediate step shown as its own column:</p>
<pre><code>print(df.assign(
    m=df['Ctgr'].isna(),
    mask=m.ne(m.shift()),
    first=df['Ctgr'].where(m.ne(m.shift())),
    ffill=df['Ctgr'].where(m.ne(m.shift())).ffill(),
    subctgr=np.where(m, np.nan, df['Ctgr'].where(m.ne(m.shift())).ffill()),
))
  Ctgr      m   mask first ffill subctgr
0    A  False   True     A     A       A
1    B  False  False   NaN     A       A
2    B  False  False   NaN     A       A
3    C  False  False   NaN     A       A
4  NaN   True   True   NaN     A     NaN
5    D  False   True     D     D       D
6    E  False  False   NaN     D       D
7    F  False  False   NaN     D       D
</code></pre>
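<p>The <code>mask</code> column marks where each run starts, so its cumulative sum gives every run its own integer id. As a variation (not the approach used above), those ids could be passed to <code>groupby</code> and the first value of each run broadcast with <code>transform('first')</code>. A minimal sketch, assuming the same sample data as in the output above; <code>run_id</code> is an illustrative name:</p>

```python
import numpy as np
import pandas as pd

# Sample data reconstructed from the printed output above.
df = pd.DataFrame({'Ctgr': ['A', 'B', 'B', 'C', np.nan, 'D', 'E', 'F']})

m = df['Ctgr'].isna()
# Each change in the mask starts a new run; cumsum numbers the runs 1, 2, 3, ...
run_id = m.ne(m.shift()).cumsum()

# 'first' returns the first non-missing value per run; an all-missing run stays missing.
df['subctgr'] = df['Ctgr'].groupby(run_id).transform('first')
print(df)
```

<p>Because <code>first</code> skips missing values, the separator rows need no extra <code>numpy.where</code> step when the separators are <code>NaN</code>; with empty-string separators they would still have to be blanked out afterwards.</p>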