<p>You can count the <code>NaN</code> values in <code>df['att1']</code>, subtract <code>1</code>, and pass the result as the <code>limit</code> parameter of <a href="http://pandas.pydata.org/pandas-docs/stable/generated/pandas.Series.fillna.html" rel="nofollow"><code>fillna</code></a>:</p>
<pre><code>import pandas as pd
import numpy as np

df = pd.DataFrame([1, 2, np.nan, np.nan, np.nan, np.nan, 3], columns=['att1'])
print(df)
   att1
0     1
1     2
2   NaN
3   NaN
4   NaN
5   NaN
6     3

# number of NaN values minus one, so the last NaN is left unfilled
s = df['att1'].isnull().sum() - 1
df['att1'] = df['att1'].fillna('missing', limit=s)
print(df)
      att1
0        1
1        2
2  missing
3  missing
4  missing
5      NaN
6        3
</code></pre>
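<p>The same idea can be wrapped in a small function. This is only a sketch: the helper name <code>fill_all_but_last_nan</code> is hypothetical, and the guard for columns with fewer than two <code>NaN</code> values is an added assumption, since <code>limit</code> would then be <code>0</code> or negative.</p>

```python
import numpy as np
import pandas as pd


def fill_all_but_last_nan(series, value="missing"):
    """Fill every NaN in `series` except the last one (hypothetical helper)."""
    n_fill = int(series.isnull().sum()) - 1
    if n_fill < 1:
        # Zero or one NaN: nothing to fill, and a limit below 1 is not usable.
        return series.copy()
    # Cast to object first so the string filler fits the column dtype.
    return series.astype(object).fillna(value, limit=n_fill)


s = pd.Series([1, 2, np.nan, np.nan, np.nan, np.nan, 3], name="att1")
result = fill_all_but_last_nan(s)
print(result)
```

<p>Rows 2 to 4 are filled and row 5 stays <code>NaN</code>, as in the output above.</p>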
<p>Edit:</p>
<p>The problem is now more complicated.</p>
<p>First create a helper column <code>count</code> that labels each run of consecutive values in column <code>att1</code>, using <a href="http://pandas.pydata.org/pandas-docs/stable/generated/pandas.Series.isnull.html" rel="nofollow"><code>isnull</code></a>, <a href="http://pandas.pydata.org/pandas-docs/stable/generated/pandas.Series.shift.html" rel="nofollow"><code>shift</code></a>, <a href="http://pandas.pydata.org/pandas-docs/stable/generated/pandas.Series.astype.html" rel="nofollow"><code>astype</code></a> and <a href="http://pandas.pydata.org/pandas-docs/stable/generated/pandas.Series.cumsum.html" rel="nofollow"><code>cumsum</code></a>. Then <a href="http://pandas.pydata.org/pandas-docs/stable/generated/pandas.DataFrame.groupby.html" rel="nofollow"><code>groupby</code></a> on this <code>count</code> column and apply <a href="http://pandas.pydata.org/pandas-docs/stable/generated/pandas.Series.fillna.html" rel="nofollow"><code>fillna</code></a> to each group:</p>
<pre><code>import pandas as pd
import numpy as np

df = pd.DataFrame([1, 2, np.nan, np.nan, np.nan, np.nan, 3, 4,
                   np.nan, np.nan, np.nan, 5], columns=['att1'])
print(df)

# a new run starts wherever the NaN flag differs from the previous row
df['count'] = (df['att1'].isnull() != df['att1'].isnull().shift()).astype(int).cumsum()
print(df)
    att1  count
0      1      1
1      2      1
2    NaN      2
3    NaN      2
4    NaN      2
5    NaN      2
6      3      3
7      4      3
8    NaN      4
9    NaN      4
10   NaN      4
11     5      5
</code></pre>
<pre><code>def f(x):
    # x is one group: a run of consecutive NaN or non-NaN rows
    att = x['att1'].isnull()
    if att.all():
        # NaN run: fill every value except the last one
        return x['att1'].fillna('missing', limit=att.sum() - 1)
    else:
        return x['att1']

print(df.groupby(['count']).apply(f).reset_index(drop=True))
0           1
1           2
2     missing
3     missing
4     missing
5         NaN
6           3
7           4
8     missing
9     missing
10        NaN
11          5
Name: att1, dtype: object
</code></pre>
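<p>As an alternative to <code>groupby</code> and <code>apply</code>, the same result can be sketched with a boolean mask: a <code>NaN</code> is the last of its run exactly when the next row is not <code>NaN</code>. This is a different technique from the one above, not a rewrite of it; the variable names <code>is_nan</code>, <code>next_is_nan</code> and <code>filled</code> are illustrative, and it assumes a pandas version where <code>shift</code> accepts <code>fill_value</code>.</p>

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({"att1": [1, 2, np.nan, np.nan, np.nan, np.nan, 3, 4,
                            np.nan, np.nan, np.nan, 5]})

is_nan = df["att1"].isnull()
# True where the following row is also NaN, i.e. this NaN is not the
# last one of its run. The final row has no successor, so it gets False.
next_is_nan = is_nan.shift(-1, fill_value=False)

# Cast to object so the string filler fits, then replace the masked rows.
filled = df["att1"].astype(object).mask(is_nan & next_is_nan, "missing")
print(filled)
```

<p>This version also handles runs that contain a single <code>NaN</code>, which are simply left unfilled.</p>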
<p>Explanation of the column <code>count</code>:</p>
<pre><code>print(df['att1'].isnull() != df['att1'].isnull().shift())
0      True
1     False
2      True
3     False
4     False
5     False
6      True
7     False
8      True
9     False
10    False
11     True
Name: att1, dtype: bool
</code></pre>
<pre><code>print((df['att1'].isnull() != df['att1'].isnull().shift()).astype(int))
0     1
1     0
2     1
3     0
4     0
5     0
6     1
7     0
8     1
9     0
10    0
11    1
Name: att1, dtype: int32
</code></pre>
<pre><code>print((df['att1'].isnull() != df['att1'].isnull().shift()).astype(int).cumsum())
0     1
1     1
2     2
3     2
4     2
5     2
6     3
7     3
8     4
9     4
10    4
11    5
Name: att1, dtype: int32
</code></pre>
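<p>The three steps can be tried on a shorter series. The sketch below uses illustrative names (<code>changed</code>, <code>run_id</code>, <code>run_lengths</code>) and also counts the rows in each run, which is not part of the answer above.</p>

```python
import numpy as np
import pandas as pd

s = pd.Series([1, 2, np.nan, np.nan, 3], name="att1")

is_nan = s.isnull()
# Step 1: True wherever the NaN flag differs from the previous row.
# The first row is compared with a missing value, so it is always True.
changed = is_nan != is_nan.shift()
# Steps 2 and 3: turn the flags into 0/1 and accumulate them into run labels.
run_id = changed.astype(int).cumsum()
print(run_id.tolist())

# Number of rows in each run, keyed by run label.
run_lengths = s.groupby(run_id).size()
print(run_lengths.tolist())
```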