I have two CSV files with the following schemas:
CSV1 columns:
"Id","First","Last","Email","Company"
CSV2 columns:
"PersonId","FirstName","LastName","Em","FavoriteFood"
If I load each of them into a Pandas DataFrame and run newdf = df1.merge(df2, how='outer', left_on=['Last', 'First'], right_on=['LastName','FirstName'])
then the CSV export of the joined DataFrame has the following schema:
"Id","First","Last","Email","Company","PersonId","FirstName","LastName","Em","FavoriteFood"
What I want is an output schema more like this:
"Id","First","Last","Email","Company","PersonId","Em","FavoriteFood"
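Most relational database software I'm familiar with behaves this way (the left-hand join column names win the naming war). Does pandas have syntax to tell it to do the same?
I can do df1.merge(df2.rename(columns = {'LastName':'Last', 'FirstName':'First'}), how='outer', on=['Last', 'First']), but stylistically, hard-coding the same column names twice in my source code drives me crazy. It also makes things harder to fix if I ever change the column names in the CSV files.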
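One approach is to merge the same way, then drop the columns you don't want (here, the right-hand key columns LastName and FirstName).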
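A minimal sketch of that merge-then-drop approach, in case it helps. The sample rows are made up for illustration; only the column names come from the question. The names left_keys and right_keys are my own, introduced so that each column name is written only once. Because this is an outer merge, rows that exist only in df2 end up with NaN in Last and First, so the sketch copies the right-hand key values across before dropping them.

```python
import pandas as pd

# Hypothetical sample data; only the column names come from the question.
df1 = pd.DataFrame({
    "Id": [1, 2],
    "First": ["Ann", "Bob"],
    "Last": ["Lee", "Ray"],
    "Email": ["ann@example.com", "bob@example.com"],
    "Company": ["Acme", "Initech"],
})
df2 = pd.DataFrame({
    "PersonId": [10, 11],
    "FirstName": ["Ann", "Cy"],
    "LastName": ["Lee", "Fox"],
    "Em": ["ann@example.org", "cy@example.org"],
    "FavoriteFood": ["Pizza", "Sushi"],
})

# Each key column name is written exactly once (these list names are assumptions).
left_keys = ["Last", "First"]
right_keys = ["LastName", "FirstName"]

newdf = df1.merge(df2, how="outer", left_on=left_keys, right_on=right_keys)

# Rows found only in df2 have NaN in the left key columns after an outer merge,
# so fill them from the right-hand key columns before dropping those.
for left_col, right_col in zip(left_keys, right_keys):
    newdf[left_col] = newdf[left_col].fillna(newdf[right_col])

newdf = newdf.drop(columns=right_keys)
print(list(newdf.columns))
```

If you prefer the rename route from the question, a similar trick should work: keep one dict such as key_map = {'LastName': 'Last', 'FirstName': 'First'}, pass it to df2.rename(columns=key_map), and use on=list(key_map.values()). With on=, pandas merges the key columns into one, so no fill step is needed.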