I have a dataframe that looks similar to this (except with longer color names):
import pandas as pd
ff = pd.DataFrame({'OldCol':['darkbrown','lightbeige','lightbrown / beige','beige','brown','beige / cognac'], 'NewCol':['nan','nan','nan','nan','nan','nan']})
I want that dataframe to look like this:
ffnew = pd.DataFrame({'OldCol':['darkbrown','lightbeige','lightbrown / beige','beige','brown','beige / cognac'], 'NewCol':['brown','beige','beige / brown','sand','brown','sand / brown']})
I tried the following:
ff.loc[ff['OldCol'].str.contains(r'brown|cognac',na=False) & ff['NewCol'].str.contains(r'nan'), 'NewCol'] = 'brown'
ff.loc[ff['OldCol'].str.contains(r'brown|cognac',na=False) & ~ff['NewCol'].str.contains(r'nan|brown'), 'NewCol'] = ff['NewCol']+'/ brown'
ff.loc[ff['OldCol'].str.contains(r'beige|sand',na=False) & ff['NewCol'].str.contains(r'nan'), 'NewCol'] = 'beige'
ff.loc[ff['OldCol'].str.contains(r'beige|sand',na=False) & ~ff['NewCol'].str.contains(r'nan|beige'), 'NewCol'] = ff['NewCol'] +'/ beige'
On my real-life dataframe I usually get an error:
ValueError: cannot reindex from a duplicate axis
Can anyone help? Thanks a lot!
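The problem is duplicate values in the index. You can replace all index values with a regular index (0, 1, 2, ..., len(df)-1) using reset_index. The old values are dropped with the parameter drop=True.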
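Here is a minimal sketch of that fix. The small frame and its duplicated index labels are made up for illustration (your real data will differ), and it only shows that the assignments from the question run once the index is unique, not that they produce the exact output you want:

```python
import pandas as pd

# Hypothetical frame whose index has duplicate labels (0 and 1 appear twice),
# which is what you might end up with after concatenating two frames.
ff = pd.DataFrame(
    {
        "OldCol": ["darkbrown", "lightbeige", "lightbrown / beige", "beige"],
        "NewCol": ["nan", "nan", "nan", "nan"],
    },
    index=[0, 1, 0, 1],
)

print(ff.index.is_unique)  # check whether the index labels are unique

# Replace the index with a fresh 0..len(ff)-1 index; drop=True discards
# the old labels instead of keeping them as a new column.
ff = ff.reset_index(drop=True)

# With a unique index, the Series on the right-hand side can be aligned
# to the selected rows without needing to reindex on duplicate labels.
brown = ff["OldCol"].str.contains(r"brown|cognac", na=False)
beige = ff["OldCol"].str.contains(r"beige|sand", na=False)

ff.loc[brown & ff["NewCol"].str.contains(r"nan"), "NewCol"] = "brown"
ff.loc[beige & ff["NewCol"].str.contains(r"nan"), "NewCol"] = "beige"
ff.loc[beige & ~ff["NewCol"].str.contains(r"nan|beige"), "NewCol"] = ff["NewCol"] + " / beige"

print(ff)
```

If you need the old index values later, leave out drop=True and they are kept as an extra column.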