Appending strings in a DataFrame: ValueError: cannot reindex from a duplicate axis

Posted 2024-09-27 00:14:21

I have a DataFrame that looks similar to this (except with longer color names):

import pandas as pd
ff = pd.DataFrame({'OldCol':['darkbrown','lightbeige','lightbrown / beige','beige','brown','beige / cognac'], 'NewCol':['nan','nan','nan','nan','nan','nan']})

I want that DataFrame to look like this:

ffnew = pd.DataFrame({'OldCol':['darkbrown','lightbeige','lightbrown / beige','beige','brown','beige / cognac'], 'NewCol':['brown','beige','beige / brown','sand','brown','sand / brown']})

I tried the following:

ff.loc[ff['OldCol'].str.contains(r'brown|cognac',na=False) & ff['NewCol'].str.contains(r'nan'), 'NewCol'] = 'brown'
ff.loc[ff['OldCol'].str.contains(r'brown|cognac',na=False) & ~ff['NewCol'].str.contains(r'nan|brown'), 'NewCol'] = ff['NewCol']+'/ brown'

ff.loc[ff['OldCol'].str.contains(r'beige|sand',na=False) & ff['NewCol'].str.contains(r'nan'), 'NewCol'] = 'beige'
ff.loc[ff['OldCol'].str.contains(r'beige|sand',na=False) & ~ff['NewCol'].str.contains(r'nan|beige'), 'NewCol'] = ff['NewCol'] +'/ beige'

On my real-life DataFrame I usually get an error:

ValueError: cannot reindex from a duplicate axis

Can anyone help? Thank you very much!


1 answer
User
#1 · Posted 2024-09-27 00:14:21

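The problem is duplicate values in the index. You can use reset_index to replace all index values with a regular index (0, 1, 2, ..., len(df)-1). The old values are discarded by the parameter drop=True: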

ff.reset_index(drop=True, inplace=True)
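Before resetting, it can help to confirm that the index really contains duplicates. The following is a minimal sketch, not part of the original answer: the three-row frame and the duplicated label 0 are assumptions chosen for illustration, and the names had_duplicates and dup_rows are hypothetical.

```python
import pandas as pd

# Small illustrative frame; the duplicated label 0 is an assumption
# standing in for whatever produced duplicates in the real data.
ff = pd.DataFrame(
    {"OldCol": ["darkbrown", "lightbeige", "beige"],
     "NewCol": ["nan", "nan", "nan"]},
    index=[0, 0, 2],
)

# True when at least one index label occurs more than once
had_duplicates = not ff.index.is_unique

# keep=False marks every row whose label is repeated, not only the later ones
dup_rows = ff[ff.index.duplicated(keep=False)]
print(had_duplicates)
print(dup_rows)

# Non-inplace form of the fix: returns a new frame with labels 0..len(ff)-1
ff = ff.reset_index(drop=True)
print(ff.index.is_unique)
```

If the old labels carry meaning, resetting without drop=True keeps them as an ordinary column instead of discarding them.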

Test:

ff = pd.DataFrame({'OldCol':['darkbrown','lightbeige','lightbrown / beige','beige','brown','beige / cognac'], 'NewCol':['nan','nan','nan','nan','nan','nan']})
ffnew = pd.DataFrame({'OldCol':['darkbrown','lightbeige','lightbrown / beige','beige','brown','beige / cognac'], 'NewCol':['brown','beige','beige / brown','sand','brown','sand / brown']})
# Simulate the problem: the label 0 now appears twice in the index
ff.index = [0,0,2,3,4,5]
# Running the assignments from the question at this point gives:
#ValueError: cannot reindex from a duplicate axis
# Rebuild a unique 0..len(ff)-1 index and discard the old labels
ff.reset_index(drop=True, inplace=True)
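The test above can be put together as one self-contained script. This is a sketch under stated assumptions: apply_rules is a hypothetical helper that wraps the four assignments from the question and works on a copy, and the duplicated index [0,0,2,3,4,5] is the one used in the test. With the duplicated index the call is expected to raise a ValueError, although the exact wording of the message can vary between pandas versions, so the script only catches the exception and prints it.

```python
import pandas as pd


def apply_rules(df):
    """Run the four assignments from the question on a copy of df."""
    out = df.copy()
    has_brown = out["OldCol"].str.contains(r"brown|cognac", na=False)
    has_beige = out["OldCol"].str.contains(r"beige|sand", na=False)

    out.loc[has_brown & out["NewCol"].str.contains(r"nan"), "NewCol"] = "brown"
    # Assigning a Series makes pandas align it to the selected rows,
    # which is the step that needs unique index labels.
    out.loc[has_brown & ~out["NewCol"].str.contains(r"nan|brown"), "NewCol"] = out["NewCol"] + "/ brown"
    out.loc[has_beige & out["NewCol"].str.contains(r"nan"), "NewCol"] = "beige"
    out.loc[has_beige & ~out["NewCol"].str.contains(r"nan|beige"), "NewCol"] = out["NewCol"] + "/ beige"
    return out


ff = pd.DataFrame({
    "OldCol": ["darkbrown", "lightbeige", "lightbrown / beige",
               "beige", "brown", "beige / cognac"],
    "NewCol": ["nan"] * 6,
})
ff.index = [0, 0, 2, 3, 4, 5]  # duplicated label 0, as in the test

try:
    apply_rules(ff)
except ValueError as err:
    print("with duplicate index:", err)

# Same rules after rebuilding a unique index
fixed = apply_rules(ff.reset_index(drop=True))
print(fixed)
```

Resetting the index only removes the error. With this sample data the rules from the question produce 'brown/ beige' for the mixed rows, not the 'beige / brown' shown in ffnew, so the matching logic itself would still need adjusting to reach the desired output.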
