我有两个数据帧,每个数据帧有300万条记录。我需要比较并显示每个匹配索引、发生不匹配的列名以及这些不匹配列中的数据
下面是dataframe和预期输出的示例:
df1=pd.DataFrame({
'Date':['2013-11-23','2013-11-24','2013-11-25','2013-11-26','2013-11-27'],
'Fruit':['Banana','Orange','Apple','Celery','Apple'],
'Num':[10.2,22.1,8.6,7.6,10.2],
'Color':['Green','Yellow','Orange','Green','Green']
})
df2=pd.DataFrame({
'Date':['2013-11-23','2013-11-24','2013-11-25','2013-11-26','2013-11-27'],
'Fruit':['Banana1','Orange','Apple','Celery','Apple'],
'Num':[10.2,22.12,8.60,7.6,10.2],
'Color':['Green','Yellow1','Orange','Green','Green']
})
df1.set_index("Date",inplace = True)
df2.set_index("Date",inplace = True)
Dataframe 1:
Fruit Num Color
Date
2013-11-23 Banana 10.2 Green
2013-11-24 Orange 22.1 Yellow
2013-11-25 Apple 8.6 Orange
2013-11-26 Celery 7.6 Green
2013-11-27 Apple 10.2 Green
DataFrame 2:
Fruit Num Color
Date
2013-11-23 Banana1 10.21 Green
2013-11-24 Orange 22.12 Yellow1
2013-11-25 Apple 8.60 Orange
2013-11-26 Celery 7.60 Green
2013-11-27 Apple 10.20 Green
Expected OUTPUT:
Date 2013-11-23 has mismatch in columns ['Fruit' , 'Num'].
DF1: ['Banana',10.2]
DF2: ['Banana1', 10.21]
--------------------------------------------------------
Date 2013-11-24 has mismatch in columns ['Num' , 'Color'].
DF1: [22.1,'Yellow']
DF2: [22.12,'Yellow1']
----------------------------------------------------------
And so on for every Date index
将文件另存为.py并运行它
输出
相关问题 更多 >
编程相关推荐