合并相同数据类型列上的两个数据帧,但获取值

2024-06-28 10:50:23 发布

您现在位置:Python中文网/ 问答频道 /正文

我想合并Income df中的profile_ID列和CompProfile df中的index CommodityClass上的两个数据帧。在

收入

          profile_ID   type          col1        col2
0         O-COMP-1006  Small_Off   4.1427e+07   4.0027e+07
1         O-COMP-1006  Small_Off   4.7915e+07   4.6515e+07
2         O-COMP-1006  Small_Off  6.10424e+07  5.96424e+07
3         O-COMP-1006  Small_Off  6.83726e+07  6.69726e+07
4         O-COMP-1008  Small_Off  7.28167e+07  7.14167e+07
5         O-COMP-1009  Small_Off   7.6147e+07   7.4747e+07
7         O-COMP-1006  Small_Off  8.02798e+07  7.88798e+07
8         O-COMP-1006  Small_Off  8.17172e+07  8.03172e+07
9         O-COMP-1006  Small_Off  8.42322e+07  8.28322e+07
10        O-COMP-1005  Small_Off  8.54957e+07  8.40747e+07
11        O-COMP-1006  Small_Off  8.67782e+07  8.53358e+07
12        O-COMP-1006  Small_Off  8.80798e+07  8.66159e+07
13        O-COMP-1007  Small_Off   8.9401e+07  8.79151e+07
14        O-COMP-1006  Small_Off   9.0742e+07  8.92338e+07

和公司简介

^{pr2}$

我使用

pd.merge( Income, CompProfile, how='left', \
    left_on = 'profile_ID', right_index=True, \
    suffixes = ("_USD","_frac") )

然后得到一个错误

ValueError: You are trying to merge on object and int64 columns. If you wish to proceed you should use pd.concat

我检查了要合并的列和索引,它们的类型是object。 我试着使用join,但得到了同样的错误。在

Income.join(CompProfile, on= 'profile_ID',lsuffix = "_USD",rsuffix = "_frac")

我还尝试在列上重置CompProfile和merge的索引:

CompProfile.reset_index()
pd.merge( Income, CompProfile, how='left', \
    left_on = 'profile_ID', right_on='CommodityClass', \
    suffixes = ("_USD","_frac") )

在这种情况下我得到

KeyError: 'CommodityClass'

我还试着从CompProfile中删除“profile_ID”行,但它没有改变任何东西。在

CompProfile.head(10).to_dict()
{'col1': {'profile_ID': nan, 'O-COMP-1001': 0.0, 'O-COMP-1002': 0.0, 'O-COMP-1003': 0.0, 'O-COMP-1004': 0.0, 'O-COMP-1005': 0.0, 'O-COMP-1006': 1.0, 'O-COMP-1007': 0.0, 'O-COMP-1008': 0.0, 'O-COMP-1009': 0.0}, 'col2': {'profile_ID': nan, 'O-COMP-1001': 0.0, 'O-COMP-1002': 0.0, 'O-COMP-1003': 0.0, 'O-COMP-1004': 0.0, 'O-COMP-1005': 0.0, 'O-COMP-1006': 0.0, 'O-COMP-1007': 0.0, 'O-COMP-1008': 0.0, 'O-COMP-1009': 1.0}, 'col3': {'profile_ID': nan, 'O-COMP-1001': 0.0, 'O-COMP-1002': 0.0, 'O-COMP-1003': 0.0, 'O-COMP-1004': 0.0, 'O-COMP-1005': 0.0, 'O-COMP-1006': 0.0, 'O-COMP-1007': 1.0, 'O-COMP-1008': 0.0, 'O-COMP-1009': 0.0}}

Tags: idindexonmergeprofileleftsmallpd
1条回答
网友
1楼 · 发布于 2024-06-28 10:50:23

你的第一次尝试很好。下面是一个完整的工作示例。您需要后退一步,并弄清楚为什么您的数据与下面提供的示例数据不同。在

import pandas as pd
from numpy import nan

d1 = {'profile_ID': {0: 'O-COMP-1006', 1: 'O-COMP-1006', 2: 'O-COMP-1006', 3: 'O-COMP-1006', 4: 'O-COMP-1008', 5: 'O-COMP-1009', 7: 'O-COMP-1006', 8: 'O-COMP-1006', 9: 'O-COMP-1006', 10: 'O-COMP-1005'}, 'type': {0: 'Small_Off', 1: 'Small_Off', 2: 'Small_Off', 3: 'Small_Off', 4: 'Small_Off', 5: 'Small_Off', 7: 'Small_Off', 8: 'Small_Off', 9: 'Small_Off', 10: 'Small_Off'}, 'col1': {0: 41427000.0, 1: 47915000.0, 2: 61042400.0, 3: 68372600.0, 4: 72816700.0, 5: 76147000.0, 7: 80279800.0, 8: 81717200.0, 9: 84232200.0, 10: 85495700.0}, 'col2': {0: 40027000.0, 1: 46515000.0, 2: 59642400.0, 3: 66972600.0, 4: 71416700.0, 5: 74747000.0, 7: 78879800.0, 8: 80317200.0, 9: 82832200.0, 10: 84074700.0}}
d2 = {'col1': {'profile_ID': nan, 'O-COMP-1001': 0.0, 'O-COMP-1002': 0.0, 'O-COMP-1003': 0.0, 'O-COMP-1004': 0.0, 'O-COMP-1005': 0.0, 'O-COMP-1006': 1.0, 'O-COMP-1007': 0.0, 'O-COMP-1008': 0.0, 'O-COMP-1009': 0.0}, 'col2': {'profile_ID': nan, 'O-COMP-1001': 0.0, 'O-COMP-1002': 0.0, 'O-COMP-1003': 0.0, 'O-COMP-1004': 0.0, 'O-COMP-1005': 0.0, 'O-COMP-1006': 0.0, 'O-COMP-1007': 0.0, 'O-COMP-1008': 0.0, 'O-COMP-1009': 1.0}, 'col3': {'profile_ID': nan, 'O-COMP-1001': 0.0, 'O-COMP-1002': 0.0, 'O-COMP-1003': 0.0, 'O-COMP-1004': 0.0, 'O-COMP-1005': 0.0, 'O-COMP-1006': 0.0, 'O-COMP-1007': 1.0, 'O-COMP-1008': 0.0, 'O-COMP-1009': 0.0}}

Income = pd.DataFrame.from_dict(d1)
CompProfile = pd.DataFrame.from_dict(d2)

res = pd.merge(Income, CompProfile, how='left',
               left_on='profile_ID', right_index=True,
               suffixes=('_USD', '_frac'))

print(res)

     profile_ID       type    col1_USD    col2_USD  col1_frac  col2_frac  col3
0   O-COMP-1006  Small_Off  41427000.0  40027000.0        1.0        0.0   0.0
1   O-COMP-1006  Small_Off  47915000.0  46515000.0        1.0        0.0   0.0
2   O-COMP-1006  Small_Off  61042400.0  59642400.0        1.0        0.0   0.0
3   O-COMP-1006  Small_Off  68372600.0  66972600.0        1.0        0.0   0.0
4   O-COMP-1008  Small_Off  72816700.0  71416700.0        0.0        0.0   0.0
5   O-COMP-1009  Small_Off  76147000.0  74747000.0        0.0        1.0   0.0
7   O-COMP-1006  Small_Off  80279800.0  78879800.0        1.0        0.0   0.0
8   O-COMP-1006  Small_Off  81717200.0  80317200.0        1.0        0.0   0.0
9   O-COMP-1006  Small_Off  84232200.0  82832200.0        1.0        0.0   0.0
10  O-COMP-1005  Small_Off  85495700.0  84074700.0        0.0        0.0   0.0

相关问题 更多 >