Removing some rows from the data after a groupby

Posted 2024-09-30 01:24:29


I have the following dataset:

import pandas as pd

df = pd.DataFrame({'scientist':["Wendelaar Bonga"," Sjoerd E.", "Grätzel"," Michael", "Willett", "Walter C.",
                         "Kessler", "Ronald C.", "Witten, Edward", "Wang, Zhong Lin"],
           'SubjectField': ["Biomedical Engineering", "Inorganic & Nuclear Chemistry",
                            "Organic Chemistry", "Biomedical Engineering", "Developmental Biology",
                            "Mechanical Engineering & Transports", "Biomedical Engineering", "Microbiology",
                            "Cardiovascular System & Hematology", "Biomedical Engineering"]})

I want to count the number of scientists in each subject field and remove the subject fields with fewer than 2 scientists from my data.

x = df.groupby('SubjectField')['scientist'].count()
ans = x[x > 2]

That is my code so far, but I don't know how to remove the rows in question.


2 answers

Try this:

mask = df.groupby('SubjectField')['SubjectField'].transform('count') > 2
filtered = df[mask]
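
To see why this works, here is a minimal sketch on a toy frame (the data and the name group_sizes are made up for illustration, not taken from the question). transform('count') returns one value per original row, namely the size of that row's group, so the comparison yields a boolean mask aligned with df:

```python
import pandas as pd

# Hypothetical toy data, smaller than the frame in the question
df = pd.DataFrame({
    "scientist": ["A", "B", "C", "D", "E"],
    "SubjectField": ["Biology", "Biology", "Biology", "Physics", "Chemistry"],
})

# One value per original row: the number of rows in that row's group
group_sizes = df.groupby("SubjectField")["SubjectField"].transform("count")

# Keep only the rows whose subject field occurs more than 2 times
filtered = df[group_sizes > 2]
print(filtered)
```

Because the mask shares the index of df, no merge or lookup is needed. If the threshold should also keep fields with exactly 2 scientists, `>= 2` would be the comparison to use instead.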

You are already on the right track; I have just added the code that removes the rows which do not satisfy the condition.

import pandas as pd

df = pd.DataFrame({'scientist':["Wendelaar Bonga"," Sjoerd E.", "Grätzel"," Michael", "Willett", "Walter C.",
                         "Kessler", "Ronald C.", "Witten, Edward", "Wang, Zhong Lin"],
           'SubjectField': ["Biomedical Engineering", "Inorganic & Nuclear Chemistry",
                            "Organic Chemistry", "Biomedical Engineering", "Developmental Biology",
                            "Mechanical Engineering & Transports", "Biomedical Engineering", "Microbiology",
                            "Cardiovascular System & Hematology", "Biomedical Engineering"]})


x = df.groupby('SubjectField')['scientist'].count()

You can use drop with its index parameter to remove the entries that do not match the condition.

The tilde (~) negates a boolean condition, so ~(x > 2) selects the entries that do not satisfy x > 2.

drop_idx = x[~(x > 2)].index.values
x = x.drop(index=drop_idx)

x will then contain only the subject fields whose count is greater than 2.
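
Note that this filters the counts Series x, not df itself. A minimal sketch of carrying the result back to the original rows, again on a made-up toy frame (the names keep, filtered_df and alt are illustrative and not part of the original answer):

```python
import pandas as pd

# Hypothetical toy data, smaller than the frame in the question
df = pd.DataFrame({
    "scientist": ["A", "B", "C", "D", "E"],
    "SubjectField": ["Biology", "Biology", "Biology", "Physics", "Chemistry"],
})

# Count scientists per subject field, as in the answer
x = df.groupby("SubjectField")["scientist"].count()

# Labels of the subject fields that pass the threshold
keep = x[x > 2].index

# Select the rows of the original frame whose field is in that set
filtered_df = df[df["SubjectField"].isin(keep)]

# Alternative: GroupBy.filter keeps whole groups for which the function returns True
alt = df.groupby("SubjectField").filter(lambda g: len(g) > 2)

print(filtered_df)
```

Both variants should give the same rows here; isin reuses the counts that were already computed, while filter does the whole thing in one call.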
