Identifying consecutive NaNs with Pandas, part 2

df = pd.DataFrame({'a': [1, 2, np.nan, np.nan, np.nan, 6, 7, 8, 9, 10, np.nan, np.nan, 13, 14]})
df
Out[38]:
      a
0     1
1     2
2   NaN
3   NaN
4   NaN
5     6
6     7
7     8
8     9
9    10
10  NaN
11  NaN
12   13
13   14

1 answer

User

#1 · Posted 2024-10-06 12:19:22

I found a solution. It's ugly, but it works. I hope you don't have a huge amount of data, because it probably won't perform well:

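import numpy as np
import pandas as pd

# np.nan rather than np.NaN: the np.NaN alias was removed in NumPy 2.0
df = pd.DataFrame({'a': [1, 2, np.nan, np.nan, np.nan, 6, 7, 8, 9, 10, np.nan, np.nan, 13, 14]})

# Count the NaNs that follow each non-NaN value: group by the running count of non-NaN values, then sum the NaN flags per group.
df1 = df.a.isnull().astype(int).groupby(df.a.notnull().astype(int).cumsum()).sum()

# Number each NaN by its position within its run. The 0's are non-NaN values, the 1's are the first in a group of NaNs. We only want to keep those.
b = df.isna()
df2 = b.cumsum() - b.cumsum().where(~b).ffill().fillna(0).astype(int)
df2 = df2.loc[df2['a'] <= 1].copy()  # .copy() so the later update acts on an independent frame

# Keep the non-zero NaN counts and set their index to the index of the first NaN in each group
df3 = df1.loc[df1 != 0]
df3.index = df2.loc[df2['a'] == 1].index

# Update the values from df3 (which has the right values, and the right index), to df2
df2.update(df3)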
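
The least obvious step above is the expression that numbers each NaN within its run. The sketch below breaks it into named pieces on a shorter, made-up series; the variable names `total`, `baseline` and `position` are only illustrative and do not appear in the answer.

```python
import numpy as np
import pandas as pd

# Illustrative data, shorter than the frame in the question.
s = pd.Series([1, 2, np.nan, np.nan, np.nan, 6, np.nan, np.nan, 9])
is_nan = s.isna()

# Running total of NaNs seen so far.
total = is_nan.cumsum()
# The running total as of the most recent non-NaN row, carried forward over NaN rows.
baseline = total.where(~is_nan).ffill().fillna(0).astype(int)
# Position of each row within its NaN run; 0 on non-NaN rows.
position = total - baseline

print(position.tolist())  # [0, 0, 1, 2, 3, 0, 1, 2, 0]
```

Rows with `position <= 1` are the non-NaN rows plus the first NaN of each run, which is the filter the answer applies to `df2`.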

The NaN-grouping trick was inspired by this answer.
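
If performance does become a concern, the same result can probably be reached with fewer intermediate objects. This is a minimal sketch, not the answer's method. It assumes the goal is what the code above produces: 0 for each non-NaN row, the run length on the first NaN of each run, and the remaining NaN rows dropped. The helper name `nan_run_lengths` is hypothetical.

```python
import numpy as np
import pandas as pd


def nan_run_lengths(s: pd.Series) -> pd.Series:
    """Hypothetical helper: 0 for non-NaN rows, run length at the first NaN of each run."""
    is_nan = s.isna()
    prev_is_nan = is_nan.shift(fill_value=False)
    # A new run id starts whenever the NaN / non-NaN state changes.
    run_id = is_nan.ne(prev_is_nan).cumsum()
    # Number of NaNs in the run each row belongs to (0 for non-NaN runs).
    run_len = is_nan.astype(int).groupby(run_id).transform("sum")
    # Keep non-NaN rows and only the first row of each NaN run.
    keep = ~is_nan | (is_nan & ~prev_is_nan)
    return run_len[keep]


df = pd.DataFrame({'a': [1, 2, np.nan, np.nan, np.nan, 6, 7, 8, 9, 10, np.nan, np.nan, 13, 14]})
result = nan_run_lengths(df['a'])
print(result.tolist())  # [0, 0, 3, 0, 0, 0, 0, 0, 2, 0, 0]
```

The returned series keeps the original index labels, so the run lengths 3 and 2 sit at labels 2 and 10, the first NaN of each run.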
