How to multiply selected columns from different DataFrames

Posted 2024-10-03 17:23:21


I have three pandas DataFrames (similar to the ones below). I also have two lists: list1 of ID_1 values, ['sdf', 'sdfsdf', ...], and list2 of ID_2 values, ['kjdf', 'kldfjs', ...].

Table1:
    ID_1    ID_2    Value
0   PUFPaY9 NdYWqAJ 0.002
1   Iu6AxdB qANhGcw 0.01
2   auESFwW jUEUNdw 0.2345
3   LWbYpca G3uZ_Rg 0.0835
4   8fApIAM mVHrayg 0.0295

Table2:
     ID_1    weight1 weight2 .....weightN
0   PUFPaY9     
1   Iu6AxdB     
2   auESFwW 
3   LWbYpca     

Table3:
    ID_2    weight1 weight2 .....weightN
0   PUFPaY9     
1   Iu6AxdB     
2   auESFwW     
3   LWbYpca     

I want a DataFrame computed as follows:

for each x (an ID_1) in list1:
    for each y (an ID_2) in list2:
        if the pair (x, y) exists in Table1:
            # one-to-one multiplication: x[weight1]*y[weight1], x[weight2]*y[weight2], ...
            temp_row = x[weights[i]] * y[weights[i]]
            temp_row.append(Value of (x, y) in Table1)
            new_dataframe.append(temp_row)

return new_dataframe

The desired new DataFrame should look like Table4:

Table4:
        weight1 weight2 weight3 .....weightN value
    0           
    1           
    2       
    3       

What I can do so far:

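new_df = df[(df.ID_1.isin(list1)) & (df.ID_2.isin(list2))]

With this I get all valid ID_1/ID_2 combinations and their values. But I don't know how to get the product of the weights from the two DataFrames (without looping over each weight[i]).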

The task is simpler now: I can iterate over new_df and, for each row, look up weight[i to n] for ID_1 in Table2 and weight[i to n] for ID_2 in Table3. Then I can append their one-to-one product, together with the "Value" from Table1, to a new FINAL_DF. But I don't want to loop. Is there a smarter way to solve this?


1 answer
Answer 1 · posted 2024-10-03 17:23:21

Is this what you want?

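import io

import numpy as np
import pandas as pd

# Sample ID_1 frame; 'aaaaaaa' has no counterpart in the ID_2 frame
data = """\
ID_1
PUFPaY9
aaaaaaa
Iu6AxdB
auESFwW
LWbYpca
"""
# sep=r'\s+' splits on whitespace (delim_whitespace is deprecated in recent pandas)
id1 = pd.read_csv(io.StringIO(data), sep=r'\s+')

# Sample ID_2 frame; 'xxxxxxx' has no counterpart in the ID_1 frame
data = """\
ID_2
PUFPaY9
Iu6AxdB
xxxxxxx
auESFwW
LWbYpca
"""
id2 = pd.read_csv(io.StringIO(data), sep=r'\s+')

# Fill weight1..weight4 with random integers in [1, 10)
cols = ['weight{}'.format(i) for i in range(1, 5)]
for c in cols:
    id1[c] = np.random.randint(1, 10, len(id1))
    id2[c] = np.random.randint(1, 10, len(id2))

id1.set_index('ID_1', inplace=True)
id2.set_index('ID_2', inplace=True)

# Element-wise product; rows are aligned by index label, columns by name
df_mul = id1 * id2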
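
Because `id1 * id2` aligns rows by index label, an ID that appears in only one frame yields a row of NaN. The following is a minimal sketch of two ways to keep only the shared IDs, assuming small hand-written weights (the numbers are illustrative, not taken from the question):

```python
import pandas as pd

# Hypothetical weights; the IDs mirror the ones used above.
id1 = pd.DataFrame(
    {"weight1": [8, 6, 8], "weight2": [9, 1, 4]},
    index=pd.Index(["PUFPaY9", "aaaaaaa", "Iu6AxdB"], name="ID_1"),
)
id2 = pd.DataFrame(
    {"weight1": [6, 1, 1], "weight2": [5, 5, 2]},
    index=pd.Index(["PUFPaY9", "Iu6AxdB", "xxxxxxx"], name="ID_2"),
)

# Option 1: multiply, then drop the rows whose ID was missing on one side.
df_mul = (id1 * id2).dropna(how="all")

# Option 2: restrict both frames to the shared IDs before multiplying,
# so no NaN rows are created in the first place.
common = id1.index.intersection(id2.index)
df_mul_common = id1.loc[common] * id2.loc[common]
```

Either way, only `PUFPaY9` and `Iu6AxdB` remain in this example.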

Step by step:

In [215]: id1
Out[215]:
         weight1  weight2  weight3  weight4
ID_1
PUFPaY9        8        9        1        1
aaaaaaa        6        1        9        2
Iu6AxdB        8        4        8        5
auESFwW        9        3        4        2
LWbYpca        7        7        1        8

In [216]: id2
Out[216]:
         weight1  weight2  weight3  weight4
ID_2
PUFPaY9        6        5        5        1
Iu6AxdB        1        5        4        5
xxxxxxx        1        2        6        4
auESFwW        3        9        5        5
LWbYpca        3        3        6        7

In [217]: id1 * id2
Out[217]:
         weight1  weight2  weight3  weight4
Iu6AxdB      8.0     20.0     32.0     25.0
LWbYpca     21.0     21.0      6.0     56.0
PUFPaY9     48.0     45.0      5.0      1.0
aaaaaaa      NaN      NaN      NaN      NaN
auESFwW     27.0     27.0     20.0     10.0
xxxxxxx      NaN      NaN      NaN      NaN
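
The multiplication above pairs an ID_1 with an ID_2 only when the two labels are identical. If the pairs should instead come from Table1, as the question describes, one possible approach is to merge the weights onto the filtered pairs and multiply the matching columns. This is only a sketch: the frames `table1`, `table2`, `table3`, the lists, and all numbers below are hypothetical stand-ins for the question's data.

```python
import pandas as pd

# Hypothetical stand-ins for Table1, Table2 and Table3.
table1 = pd.DataFrame({
    "ID_1": ["PUFPaY9", "Iu6AxdB", "auESFwW"],
    "ID_2": ["NdYWqAJ", "qANhGcw", "jUEUNdw"],
    "Value": [0.002, 0.01, 0.2345],
})
table2 = pd.DataFrame({
    "ID_1": ["PUFPaY9", "Iu6AxdB", "auESFwW"],
    "weight1": [8, 8, 9],
    "weight2": [9, 4, 3],
})
table3 = pd.DataFrame({
    "ID_2": ["NdYWqAJ", "qANhGcw", "jUEUNdw"],
    "weight1": [6, 1, 3],
    "weight2": [5, 5, 9],
})
list1 = ["PUFPaY9", "Iu6AxdB"]
list2 = ["NdYWqAJ", "qANhGcw", "jUEUNdw"]

# Keep only the (ID_1, ID_2) pairs from Table1 whose IDs are in both lists.
pairs = table1[table1["ID_1"].isin(list1) & table1["ID_2"].isin(list2)]

# Attach the weights for each side; the suffixes tell the two sets apart.
merged = (
    pairs.merge(table2, on="ID_1")
         .merge(table3, on="ID_2", suffixes=("_x", "_y"))
)

weight_cols = [c for c in table2.columns if c.startswith("weight")]

# Element-wise product of matching weight columns, with no Python loop over rows.
x = merged[[c + "_x" for c in weight_cols]].to_numpy()
y = merged[[c + "_y" for c in weight_cols]].to_numpy()
final_df = pd.DataFrame(x * y, columns=weight_cols, index=merged.index)
final_df["value"] = merged["Value"]
```

Here `final_df` has one row per valid pair, with the weight products followed by the `value` column, which matches the layout of Table4.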
