KeyError when trying to modify column values based on a column in another DataFrame

Posted 2024-09-24 02:25:58


I have two DataFrames (df1 and df2):

df1

address     mon    tue    wed    ...
address1     40     40     40    ...
address2     20     20     20    ...
address3     30     30      0    ...
address3      0      0     30    ...
...         ...    ...    ...    ...

df2

address     mon    tue    wed    ...
address1      0     15      0    ...
address2      0      6      0    ...
address3     15      0      0    ...
...         ...    ...    ...    ...

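What I want to do: when a value in a df1 column (e.g. mon) is greater than 0 and the corresponding value in df2 is also greater than 0, replace the df1 value with the df2 value: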

df1 after modification

address     mon    tue    wed    ...
address1     40     15     40    ...
address2     20      6     20    ...
address3     15     30      0    ...
address3      0      0     30    ...
...         ...    ...    ...    ...

I am trying the following code (based on this):

for index, _ in df1.iterrows():
    if df1.loc[index, 'mon'] > 0:
        df1.loc[index, 'mon'] = float(
            df2.loc[(df2['address'] == df1[index, 'address']), 'mon'])

But I get KeyError: (4, 'address'):

Traceback (most recent call last):
  File "/usr/lib64/python3.8/site-packages/pandas/core/indexes/base.py", line 3361, in get_loc
    return self._engine.get_loc(casted_key)
  File "pandas/_libs/index.pyx", line 76, in pandas._libs.index.IndexEngine.get_loc
  File "pandas/_libs/index.pyx", line 108, in pandas._libs.index.IndexEngine.get_loc
  File "pandas/_libs/hashtable_class_helper.pxi", line 5198, in pandas._libs.hashtable.PyObjectHashTable.get_item
  File "pandas/_libs/hashtable_class_helper.pxi", line 5206, in pandas._libs.hashtable.PyObjectHashTable.get_item
KeyError: (4, 'address')

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/mnt/project/script.py", line 78, in <module>
    df2.loc[(df2['address'] == df1[index, 'address']), 'mon'])
  File "/usr/lib64/python3.8/site-packages/pandas/core/frame.py", line 3458, in __getitem__
    indexer = self.columns.get_loc(key)
  File "/usr/lib64/python3.8/site-packages/pandas/core/indexes/base.py", line 3363, in get_loc
    raise KeyError(key) from err
KeyError: (4, 'address')

What might I be doing wrong?

Thanks in advance.


1 answer
User
#1 · Posted 2024-09-24 02:25:58
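
On the KeyError itself: the second traceback goes through DataFrame.__getitem__ and columns.get_loc, which means df1[index, 'address'] was treated as a lookup of a column labelled (4, 'address'). Row/column access needs .loc, as the rest of the loop already uses. A minimal sketch of the difference, assuming a small stand-in frame (the data and the index value below are illustrative, not the asker's):

```python
import pandas as pd

# Hypothetical stand-in for df1; only the columns needed for the demo.
df1 = pd.DataFrame({"address": ["address1", "address2"], "mon": [40, 20]})

index = 1
caught = None

# df1[key] looks up a *column* label, so the tuple is taken as a column name.
try:
    df1[index, "address"]
except KeyError as err:
    caught = err
    print("KeyError:", err)

# .loc[row, column] performs the row/column lookup the loop intended.
value = df1.loc[index, "address"]
print(value)
```

Even with that fixed, a row-by-row loop is slow on larger frames, so a vectorized approach is usually preferable.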

Use mask and combine_first

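Set the address column as the index of both DataFrames, then create a boolean mask that is True where the df1 and df2 values are both greater than 0. Use mask to set every cell matching that condition to NaN, then use combine_first to fill df1's NaN values with the values from df2.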

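# Index both frames by address; reindex df2 so its rows line up with df1's
df1 = df1.set_index('address')
df2 = df2.set_index('address').reindex(df1.index)
# True where both df1 and df2 hold a value greater than 0
mask = df1.gt(0) & df2.gt(0)
# Set those cells in df1 to NaN, then fill them from df2
df1 = df1.mask(mask).combine_first(df2).reset_index()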

Output:

>>> df1
    address   mon   tue  wed
0  address1  40.0  15.0   40
1  address2  20.0   6.0   20
2  address3  15.0  30.0    0
3  address3   0.0   0.0   30
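
The mon and tue columns come back as floats because mask introduces NaN before the fill. A possible variant, sketched here with DataFrame.where in place of mask plus combine_first, picks df2's value directly and so should leave the integer columns as they are. The sample data is copied from the tables in the question with the elided "..." columns left out, and the names left, right, both_positive and result are only illustrative:

```python
import pandas as pd

# Sample data from the question's tables; the elided "..." columns are omitted.
df1 = pd.DataFrame({
    "address": ["address1", "address2", "address3", "address3"],
    "mon": [40, 20, 30, 0],
    "tue": [40, 20, 30, 0],
    "wed": [40, 20, 0, 30],
})
df2 = pd.DataFrame({
    "address": ["address1", "address2", "address3"],
    "mon": [0, 0, 15],
    "tue": [15, 6, 0],
    "wed": [0, 0, 0],
})

left = df1.set_index("address")
# Repeat df2's rows so they line up with df1's (duplicated) addresses.
right = df2.set_index("address").reindex(left.index)

# True where both frames hold a value greater than 0.
both_positive = left.gt(0) & right.gt(0)

# Keep df1's value where the condition is False, otherwise take df2's.
result = left.where(~both_positive, right).reset_index()
print(result)
```

This assumes every address in df1 also appears in df2; an address missing from df2 would become a NaN row after reindex and would need separate handling.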
