从我前面的问题Is there any method to match tabular list with pivot list format?中,我找到了使用DataFrames.melt方法获得结果的方法。但是,我发现我在excel中的数据有重复的标题名称(颜色),pandas会自动添加.1
,我想检查是否有其他方法不需要标题为相同的名称来匹配它
匹配的标准是,如果颜色-粉色、黄色、蓝色、红色与M列表的列匹配,则获取代码号和品牌号
详情如下:
df = pd.DataFrame({'M': ['M1','M2','M3','M4','M5','M6','M7','M8','M9','M10'],
'Color' : ['pink','yellow','yellow','yellow','blue','pink','pink','pink','red','blue']})
df1 = pd.DataFrame({'M_list_1': ['S3','M73','M18','M12','M1','M2','M7','S5','S17','S18','S20','M78','Z4','M6','S8','M10','M99','S9'],
'M_list_2': ['M3','Z73','Z18','Z12','K1','S2','Z7','M5','M17','M18','M20','M78','M4','Z6','M8','S10','M99','M9'],
'pink': [23255,1755,7897,6666,1311,78946,44893,148791,18783,487913,79815,88888,149643,154894,165496,1351979,15648,5222],
'yellow': [1000,99999,78945,15543,10000,20000,14613,77778,45139,11776,78,1231,789613,10009,70009,88009,13036,789746],
'blue': [2000,8000,9000,6000,104560,204890,146103,70078,40009,11006,78789,13531,71113,10809,70999,889,36,7846],
'red': [300000,82200,90,60330,11560,21120,14783,1578,31109,91006,9719,9131,1113,1009,799,871,10116,1146],
'pink.1': ["Brand9","Brand9","Brand9","Brand13","Brand77","-","Brand1","Brand1"
,"Brand1","Brand51","Brand71","Brand16","Brand96","Brand15","Brand61","-","Brand13","-"],
'yellow.1': ["Brand9","-","-","-","Brand17","Brand61","Brand13","-"
,"Brand8","Brand9","Brand1","-","Brand77","-","Brand1","-","Brand11","Brand1"],
'blue.1': ["-","Brand13","Brand9","-","Brand1","Brand1","-","-"
,"Brand1","Brand16","Brand1","-","Brand9","Brand1","Brand1","-","-","-"],
'red.1': ["Brand8","Brand91","Brand9","Brand13","Brand59","Brand1","Brand1","Brand1"
,"Brand2","-","Brand1","-","Brand9","Brand77","Brand1","Brand19","-","Brand2"]})
我的期望结果如下:
M Color Code Brand
0 M1 pink 1311 Brand77
1 M2 yellow 20000 Brand61
2 M3 yellow 1000 Brand9
3 M4 yellow 789613 Brand77
4 M5 blue 70078 -
5 M6 pink 154894 Brand15
6 M7 pink 44893 Brand1
7 M8 pink 165496 Brand61
8 M9 red 1146 Brand2
9 M10 blue 889 -
从您的
DataFrames
,我们构建df_full
:我们首先从} 构建
Color
列获取pink
、yellow
、blue
和red
列中的值,以使用^{Code
列,如下所示:然后我们对一个新列
Color1
执行相同的操作,它允许我们构建Brand
并获得预期的结果:编辑:
正如@bbbbbb所评论的,} :
lookup
现在产生一个未来警告,因为它将被贬值。使用pandas>=1.1.0
我们必须使用^{与
Brand
相同:此解决方案首先由所有列使用} 和^{} 重新设置值,以便在最后一步使用
split
,而不首先由.
使用M
,因此需要在MultiIndex
中替换NaN
,然后由^{merge
:相关问题 更多 >
编程相关推荐