Pandas duplicated(): return all duplicates except one row, in a single pass

lucky_winners.full_name
62     Marie Curie, née Sklodowska
215    Comité international de la Croix Rouge (Intern...
340    Linus Carl Pauling
348    Comité international de la Croix Rouge (Intern...
424    John Bardeen
505    Frederick Sanger
523    Office of the United Nations High Commissioner...

2 Answers

User

#1 · Edited 2024-09-29 19:34:24

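If you want to find all unique values that occur more than once, one approach is to use np.unique with the optional return_counts=True argument. The two arrays in the resulting tuple (unique, counts) can be combined to select every unique value whose count is greater than 1: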

In [1]: import numpy as np

In [2]: import pandas as pd

In [3]: # mash keys to get a series with repeated values
   ...: s = pd.Series(list('abcoiansfaionawiaonwncawowc'))

In [4]: # get unique values and counts
   ...: u, c = np.unique(s, return_counts=True)

In [5]: # find all unique keys with occurrence counts > 1
   ...: u[c > 1]
Out[5]: array(['a', 'c', 'i', 'n', 'o', 'w'], dtype=object)
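
For comparison, here is a minimal sketch of the same idea using pandas alone. It swaps np.unique for Series.value_counts, which is not what this answer uses, so treat it as an alternative rather than the author's method. The variable names counts and repeated are only illustrative.

```python
import pandas as pd

# Same example series as in the answer above
s = pd.Series(list('abcoiansfaionawiaonwncawowc'))

# value_counts() returns a Series mapping each distinct value to its count
counts = s.value_counts()

# Keep only the values seen more than once; sort for a stable order
repeated = sorted(counts[counts > 1].index)
print(repeated)
```

value_counts orders its result by frequency, so the explicit sort is there only to make the output order predictable.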

User

#2 · Edited 2024-09-29 19:34:24

So what I did was get all the duplicates with no repetition among them (read the question again first):

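First, get every row whose full_name has already appeared earlier (names with more than one occurrence):
lucky_winners = df[df.duplicated(['full_name'])]
Then drop the duplicates from this newly created DataFrame. Assigning the result back, rather than using inplace=True on a slice of df, avoids a SettingWithCopyWarning:
lucky_winners = lucky_winners.drop_duplicates(subset=['full_name'])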

That's it! This way I get each duplicated row exactly once, with no repeats.
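
The two steps can be sketched end to end as follows. The original df is not shown in the post, so the sample data here is hypothetical, as are the entry column and the intermediate name repeats. The last line also shows keep=False, an existing option of duplicated() that flags every occurrence of a repeated name instead of all but the first.

```python
import pandas as pd

# Hypothetical sample data; the real df is not shown in the post
df = pd.DataFrame({
    "full_name": ["Ann", "Bob", "Ann", "Cy", "Bob", "Ann"],
    "entry": [1, 2, 3, 4, 5, 6],
})

# Step 1: rows whose full_name already appeared earlier (default keep='first')
repeats = df[df.duplicated(["full_name"])]

# Step 2: collapse those rows to one per name
lucky_winners = repeats.drop_duplicates(subset=["full_name"])
print(lucky_winners)

# Alternative view: every row belonging to a repeated name
all_occurrences = df[df.duplicated(["full_name"], keep=False)]
print(all_occurrences)
```

Here "Ann" appears three times and "Bob" twice, so step 1 keeps three rows and step 2 reduces them to one row per name. Each surviving row is the second occurrence of its name, not the first, because the default keep='first' leaves the first occurrence unflagged.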
