Maintaining the primary sort order of a Pandas DataFrame within groups

2 answers

User

Answer #1 · edited 2024-05-15 20:27:51

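value_counts sorts its result in descending order by default, so groupby.value_counts should do what you need. If you want to see the top n rows for each country, use groupby.head to take the first n rows from each country group.

Example: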

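import pandas as pd
from io import StringIO

# Columns are separated by two or more spaces; a regex separator needs the python engine
df = pd.read_csv(StringIO("""Country     Pattern
Hong Kong   def
Hong Kong   abc
Hong Kong   def
Hong Kong   ghi
Australia   ghi
Australia   jkl
Australia   jkl
Australia   abc
Australia   jkl"""), sep=r"\s{2,}", engine="python")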

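Grouping by Country and calling value_counts gives a Series sorted by count in descending order within each group:

df.groupby("Country")['Pattern'].value_counts()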

To see the top entries for each country (level 0 of the index), use groupby.head, which takes the first n rows of each group (n = 2 here):

df.groupby("Country")['Pattern'].value_counts().groupby(level=0).head(2)

#Country    Pattern
#Australia  jkl        3
#           abc        1
#Hong Kong  def        2
#           abc        1
#Name: Pattern, dtype: int64

User

Answer #2 · edited 2024-05-15 20:27:51

You could try something like this:

seqdf.groupby('Country')['Pattern'].value_counts().to_frame('quantity').reset_index().sort_values(['Country', 'quantity'], ascending=[True, False])[:100]
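The chain above can be tried on a small frame. The sketch below is only an illustration: `seqdf` here is a hypothetical stand-in built from the sample rows in the first answer, and the trailing `[:100]` slice is left out because the sample is tiny.

```python
import pandas as pd

# Hypothetical stand-in for seqdf, reusing the sample rows from the first answer
seqdf = pd.DataFrame({
    "Country": ["Hong Kong"] * 4 + ["Australia"] * 5,
    "Pattern": ["def", "abc", "def", "ghi", "ghi", "jkl", "jkl", "abc", "jkl"],
})

counts = (
    seqdf.groupby("Country")["Pattern"]
    .value_counts()
    .to_frame("quantity")   # name the count column explicitly
    .reset_index()          # Country and Pattern become ordinary columns
    .sort_values(["Country", "quantity"], ascending=[True, False])
)
print(counts)
```

Naming the column with to_frame('quantity') before reset_index avoids a clash between the Series name and the Pattern index level, so the same chain should behave the same way whatever name value_counts gives its result.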

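To limit each country to 10 patterns and get a plain DataFrame, take the first 10 rows of each group with groupby(level=0).head(10) before calling reset_index().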
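A minimal sketch of capping each country at its top n patterns and returning a flat DataFrame. It assumes the value_counts result is already sorted by count within each country, so head(n) keeps the most frequent patterns. The `seqdf` sample and the variable `n` are hypothetical and chosen for illustration; n is set to 2 instead of 10 so the result stays small.

```python
import pandas as pd

# Hypothetical stand-in for seqdf, reusing the sample rows from the first answer
seqdf = pd.DataFrame({
    "Country": ["Hong Kong"] * 4 + ["Australia"] * 5,
    "Pattern": ["def", "abc", "def", "ghi", "ghi", "jkl", "jkl", "abc", "jkl"],
})

n = 2  # illustrative value; the text above uses 10

top_n = (
    seqdf.groupby("Country")["Pattern"]
    .value_counts()          # counts in descending order within each country
    .groupby(level=0)        # regroup by the Country index level
    .head(n)                 # keep the first n rows of each country
    .to_frame("quantity")
    .reset_index()           # flatten the MultiIndex into columns
)
print(top_n)
```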

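Or like this, sorting before taking head(10) so that the rows kept for each country are its most frequent patterns:

seqdf.groupby(['Country', 'Pattern']).agg({'Pattern':'count'}).rename(columns={'Pattern':'quantity'}).reset_index().sort_values(['Country', 'quantity'], ascending=[True, False]).groupby('Country').head(10)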
