Counting the comma-separated elements in an array column and storing the counts in their own column

Posted 2024-09-28 03:20:49

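I have a DataFrame in which one column returns the following when I ask for its unique values (my original idea was to map the counts by hand if there were only a few combinations):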

df.amenities.unique()
array(['{TV,Wifi,Kitchen,Elevator,Heating,Washer,"First aid kit","Fire extinguisher",Essentials,Hangers,"Hair dryer",Iron,"Laptop friendly workspace","Private entrance"}',
       '{TV,Wifi,Kitchen,"Free parking on premises","Indoor fireplace",Heating,"Family/kid friendly",Washer,"First aid kit","Fire extinguisher",Essentials,"Lock on bedroom door",Hangers,"Hair dryer",Iron,"Laptop friendly workspace","Private entrance"}'])

To process this array of amenities, I decided to strip the quotation marks first:

df['amenities'] = df['amenities'].str.replace('"', '')

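My strategy was to count the commas in each array element, add 1 to account for the last item (which has no trailing comma), and use reset_index to name the column where I want the counts to appear: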

(df['amenities'].str.count(',').add(1).sum().reset_index(name='amenities_count'))

This doesn't quite work, because I get the error:

AttributeError: 'numpy.int64' object has no attribute 'reset_index'

If possible, could you explain why this is not a good approach, and what a good alternative would be?

Thank you for your time.

In response to Bernard:

Dataframe:

    Apt    Counties    amenities
    S1       C1          {TV, "Kitchen", "WiFi"}
    S1       C1          {"Hair dryer"}
    S2       C1          {"Heating", Essentials}
    S2       C2          {"Cable", Kitchen, "WiFi"}

Output:

    Apt    Counties    amenities                       amenities_counts
    S1       C1          {TV, "Kitchen", "WiFi"}        3
    S1       C1          {"Hair dryer"}                 1
    S2       C1          {"Heating", Essentials}        2
    S2       C2          {"Cable", Kitchen, "WiFi"}     3

1 answer
User
#1 · Posted 2024-09-28 03:20:49

As an example, count the ',' characters, add 1, and assign the result to a new column:

df['amenities_count'] = df.amenities.str.count(',').add(1)    

Out[1274]:
  Apt Counties                   amenities  amenities_count
0  S1       C1     {TV, "Kitchen", "WiFi"}                3
1  S1       C1              {"Hair dryer"}                1
2  S2       C1     {"Heating", Essentials}                2
3  S2       C2  {"Cable", Kitchen, "WiFi"}                3
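On why the original attempt failed: `.sum()` collapses the per-row counts into a single `numpy.int64` scalar, and a scalar has no index to reset, hence the `AttributeError`. The sketch below reproduces this on the sample frame from the question and then assigns the per-row Series directly. The frame is rebuilt by hand here, and the `amenities_count_safe` column is a hypothetical extra (not part of the answer above) showing one way to make an empty `{}` count as 0 instead of 1. It still assumes that no amenity name contains a comma of its own.

```python
import pandas as pd

# Sample data rebuilt by hand from the frame shown in the question.
df = pd.DataFrame({
    "Apt": ["S1", "S1", "S2", "S2"],
    "Counties": ["C1", "C1", "C1", "C2"],
    "amenities": [
        '{TV, "Kitchen", "WiFi"}',
        '{"Hair dryer"}',
        '{"Heating", Essentials}',
        '{"Cable", Kitchen, "WiFi"}',
    ],
})

# One count per row: number of commas plus one.
per_row = df["amenities"].str.count(",").add(1)

# .sum() reduces the Series to one scalar, which has no reset_index method.
total = per_row.sum()
has_reset_index = hasattr(total, "reset_index")

# Assign the per-row Series directly instead of summing it.
df["amenities_count"] = per_row

# Hypothetical guard (an assumption, not from the original answer):
# an empty "{}" contains no commas but would otherwise be counted as 1.
inner = df["amenities"].str.strip("{}").str.strip()
df["amenities_count_safe"] = inner.str.count(",").add(1).where(inner.ne(""), 0)

print(df)
```

Removing the quotation marks beforehand is not needed for this count, since only the commas are inspected.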
