Adding a count to duplicate keys for merging

Posted 2024-10-02 10:32:30


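I created a key for merging. Unfortunately, some of the keys are duplicated, but I need to keep those rows. My idea is that, within each group of duplicate keys, I could append a running count (1, 2, 3, and so on) to each duplicate so that every key becomes unique.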

Can you recommend a command or method for doing this? Thank you very much.

This is the code I have so far, which comes before the part I am stuck on:

# Build a key variable for merging
df['dfkey'] = df['ColA'].map(str) + ' ' + df['ColB'].map(str) + ' ' + df['ColC'].map(str)
# Frequency of each dfkey, to check whether the keys are unique
df['dfkeycount'] = df.groupby('dfkey')['dfkey'].transform('count')
# Frequency of each dfkey per Category. Later the dataset will be split by Category
# and the pieces merged side by side (one variable renamed after the category name).
df['dfkeycountcat'] = df.groupby(['dfkey', 'Category'])['dfkey'].transform('count')

# Rows with clean keys; merging already works within this subset
dataunique = df.loc[df['dfkeycountcat'] == 1]
# Rows with duplicated keys; these need a sequence number appended to the key
dataduplicate = df.loc[df['dfkeycountcat'] > 1]

1 answer
User
#1 · Posted 2024-10-02 10:32:30

Many thanks to everyone who replied. I was able to do this with cumcount:

df['dfkeynew'] = df['dfkey'].map(str) + df.groupby('dfkey').cumcount().map(str)
df['dfkeycountnew'] = df.groupby('dfkeynew')['dfkeynew'].transform('count')   

df['dfkeycountnew'].value_counts()

The keys are all unique now.
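For anyone landing here later, here is a minimal sketch of the same cumcount idea on toy data. The column names follow the question, but the values, the "left"/"right" category labels, and the Value column are made up for illustration. It differs from the code above in two ways, both of which are assumptions on my part rather than something the original poster confirmed: the counter is computed per (dfkey, Category), so that each Category subset numbers its repeats from 0 and the suffixed keys can line up in the side-by-side merge the question describes, and a "_" separator is placed between the key and the counter so that a key ending in a digit cannot run together with the counter.

```python
import pandas as pd

# Hypothetical toy data: column names follow the question, values are invented.
df = pd.DataFrame({
    "ColA": ["a", "a", "a", "b"],
    "ColB": [1, 1, 1, 2],
    "ColC": ["x", "x", "x", "y"],
    "Category": ["left", "left", "right", "left"],
    "Value": [10, 20, 30, 40],
})

# Same key construction as in the question.
df["dfkey"] = df["ColA"].map(str) + " " + df["ColB"].map(str) + " " + df["ColC"].map(str)

# cumcount numbers the rows of each group 0, 1, 2, ... in their existing order.
# Grouping by Category as well is an assumption: it restarts the counter
# inside every Category subset.
seq = df.groupby(["dfkey", "Category"]).cumcount()

# The "_" separator (also an assumption) keeps key and counter from blending.
df["dfkeynew"] = df["dfkey"] + "_" + seq.map(str)

# Split by Category, rename the value column, and merge side by side.
left = (df.loc[df["Category"] == "left", ["dfkeynew", "Value"]]
          .rename(columns={"Value": "Value_left"}))
right = (df.loc[df["Category"] == "right", ["dfkeynew", "Value"]]
           .rename(columns={"Value": "Value_right"}))
merged = left.merge(right, on="dfkeynew", how="outer")

print(df[["dfkey", "Category", "dfkeynew"]])
print(merged)
```

Note that this pairs duplicates purely by their row order within each Category, so the first "a 1 x" on one side is matched with the first "a 1 x" on the other. If the rows have a meaningful order, sorting before calling cumcount is probably worth it.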
