Slicing within the groups of a DataFrameGroupBy object

Grouped_object = df.groupby(['col1', 'col2'])

def delete_rows(group):
    pos_min_notna = group[group['cumsum'].notna()].index[0]
    return group[pos_min_notna:]

new_df = Grouped_object.apply(delete_rows)

1 answer

Anonymous user

#1 · Posted on 2024-09-20 04:01:37

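In pandas you have to be very careful to distinguish the index label (loc) from the index position (iloc). It is always a good idea to make explicit which one you mean.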

This answer gives a good overview of the difference.
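As a small illustration of that difference, here is a sketch using a made-up frame whose index labels deliberately do not match the row positions (the frame and the variable names are assumptions for illustration, not from the question):

```python
import pandas as pd

# Hypothetical frame: labels 3, 2, 1, 0 sit at positions 0, 1, 2, 3
frame = pd.DataFrame({"x": [10, 20, 30, 40]}, index=[3, 2, 1, 0])

by_label = frame.loc[1:]      # label-based: starts at the row labelled 1
by_position = frame.iloc[1:]  # position-based: skips the first row
by_plain = frame[1:]          # a plain integer slice on a DataFrame is positional

print(by_label)
print(by_position)
```

With this frame, `loc[1:]` keeps the rows labelled 1 and 0, while `iloc[1:]` and the plain slice keep the rows labelled 2, 1 and 0.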

Grouped_object = df.groupby(['col1', 'col2']) 

def delete_rows(group):
  pos_min_notna = group[group['cumsum'].notna()].index[0]  # returns value of the index = loc
  return group.loc[pos_min_notna:]  # make loc explicit

new_df = Grouped_object.apply(delete_rows)  # this dataframe has a messed up index :)
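If the reshaped index produced by apply is a problem, one possible alternative (not the approach above, just a sketch) is to build a boolean mask per group and filter the original frame, which keeps the original index. The sample data below is invented; only the column names `col1`, `col2` and `cumsum` come from the question.

```python
import numpy as np
import pandas as pd

# Invented sample data using the question's column names
df = pd.DataFrame({
    "col1": ["a", "a", "a", "b", "b"],
    "col2": ["x", "x", "x", "y", "y"],
    "cumsum": [np.nan, 1.0, 2.0, np.nan, 5.0],
})

# 1 where 'cumsum' is present, 0 where it is NaN
has_value = df["cumsum"].notna().astype(int)

# Within each group, the running maximum flips to 1 at the first non-NaN row
# and stays 1 for the rest of the group
mask = has_value.groupby([df["col1"], df["col2"]]).cummax().astype(bool)

# Boolean filtering keeps the original index labels
new_df = df[mask]
print(new_df)
```

Unlike the label slice, this does not depend on how the index is ordered, and a group with no non-NaN value is simply dropped rather than raising an error.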

A minimal example that demonstrates the unwanted behavior:

import pandas as pd

df = pd.DataFrame([[1,2,3], [2,4,6], [2,4,6]], columns=['a', 'b', 'c'])

# Drop the first row of every group
df.groupby('a').apply(lambda g: g.iloc[1:])

# Identical results as:
df.groupby('a').apply(lambda g: g[1:])

# Keep the rows of each group whose index label is 1 or higher (label-based)
# Not meaningful with a fixed index on a sorted df, but it illustrates the difference
df.groupby('a').apply(lambda g: g.loc[1:])
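To see which rows each variant keeps, the groups can be inspected one at a time. This is a sketch on the same minimal frame; the `kept` dictionary is an illustrative name, not part of the original answer.

```python
import pandas as pd

df = pd.DataFrame([[1, 2, 3], [2, 4, 6], [2, 4, 6]], columns=["a", "b", "c"])

# For every group, record the original index labels that survive each slice
kept = {}
for key, g in df.groupby("a"):
    kept[int(key)] = {
        "iloc": g.iloc[1:].index.tolist(),  # positional: drop the first row
        "plain": g[1:].index.tolist(),      # plain slice: also positional
        "loc": g.loc[1:].index.tolist(),    # label-based: labels >= 1
    }

print(kept)
```

For the group a == 2 (index labels 1 and 2), the positional slices keep only label 2, while `loc[1:]` keeps both rows, because label 1 is itself included. For the group a == 1 (index label 0 only), all three slices are empty.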
