Replace rows in one DataFrame with rows from another DataFrame

Posted 2024-06-14 05:29:57

I am trying to replace the values of rows in one DataFrame with the values from another DataFrame.

Here is the sample code:

import pandas as pd
import numpy as np
from pprint import pprint

raceA = ['r1','r3','r4','r5','r6','r7','r8', 'r9']
qualifierA = ['last','first','first','first','last','last','first','first']
participantA = ['rat','rat','cat','cat','rat','dog','dog','dog']
dfA = pd.DataFrame(
    {'race':raceA,
     'qualifier':qualifierA,
     'participant':participantA

    }
)
pprint(dfA)

raceB = ['r1','r2','r3','r4','r5','r6','r7','r8', 'r9','r10']
qualifierB = ['last',np.nan,np.nan,'first','first','last','last','first','first',np.nan]
participantB = ['rat','rat',np.nan,'cat','cat','rat','dog','dog',np.nan,np.nan]
dfB = pd.DataFrame(
    {'race':raceB,
     'qualifier':qualifierB,
     'participant':participantB

    }
)
pprint(dfB)
dfB.loc[dfB.race.isin(dfA.race), ['qualifier','participant']] = dfA[['qualifier','participant']]
pprint(dfB)

For example, dfA contains:

r9     first         dog

dfB contains:

 r9     first         NaN

Expected output in dfB:

r9     first         dog

Actual output:

r9       NaN         NaN

Can someone look into this?


3 Answers

I would do this in several steps.

First, I would merge the two DataFrames:

dfB_PreProcessing = dfB.merge(dfA,left_on='race',right_on='race',how="left")

Then clean up the participant column:

# Blank out missing values from dfB, then fall back to dfA's value (participant_y) where blank
dfB_PreProcessing['participant_x'] = dfB_PreProcessing['participant_x'].replace(np.nan, '')
dfB_PreProcessing['participant'] = np.where(dfB_PreProcessing['participant_x'] == '', dfB_PreProcessing['participant_y'], dfB_PreProcessing['participant_x'])

Then clean up the qualifier column (if needed):

# Same fallback for qualifier: keep dfB's value unless it was missing
dfB_PreProcessing['qualifier_x'] = dfB_PreProcessing['qualifier_x'].replace(np.nan, '')
dfB_PreProcessing['qualifier'] = np.where(dfB_PreProcessing['qualifier_x'] == '', dfB_PreProcessing['qualifier_y'], dfB_PreProcessing['qualifier_x'])

Then select only the required columns for the output DataFrame:

dfB = dfB_PreProcessing.loc[:,['race','qualifier','participant']]


Let me know whether this works.
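The merge-based steps above can also be wrapped in one small helper. This is only a sketch: the function name fill_from_other, the "_src" suffix, and the reduced sample frames are illustrative assumptions, and it assumes race is unique in the source frame (duplicate keys would multiply rows in the merge).

```python
import numpy as np
import pandas as pd


def fill_from_other(target, source, key, columns):
    """Fill NaN cells of `target` from the `source` row with the same `key`."""
    # A left merge keeps every row of target; source columns get the "_src" suffix.
    merged = target.merge(source, on=key, how="left", suffixes=("", "_src"))
    for col in columns:
        # Keep target's own value where present, otherwise take the source value.
        merged[col] = merged[col].fillna(merged[col + "_src"])
    return merged[[key] + list(columns)]


# Reduced, hypothetical versions of dfA and dfB from the question.
dfA = pd.DataFrame({"race": ["r3", "r9"],
                    "qualifier": ["first", "first"],
                    "participant": ["rat", "dog"]})
dfB = pd.DataFrame({"race": ["r2", "r3", "r9"],
                    "qualifier": [np.nan, np.nan, "first"],
                    "participant": ["rat", np.nan, np.nan]})

result = fill_from_other(dfB, dfA, "race", ["qualifier", "participant"])
print(result)
```

Rows of dfB with no match in dfA (r2 here) keep whatever values they already had, including NaN.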

Use fillna with set_index on both DataFrames:

df = dfB.set_index('race').fillna(dfA.set_index('race')).reset_index()

print(df)
  race qualifier participant
0   r1      last         rat
1   r2       NaN         rat
2   r3     first         rat
3   r4     first         cat
4   r5     first         cat
5   r6      last         rat
6   r7      last         dog
7   r8     first         dog
8   r9     first         dog
9  r10       NaN         NaN
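A small sketch of why set_index matters here, using reduced hypothetical data rather than the full frames from the question. fillna with a DataFrame argument matches cells by index label and column name, so with the default integer index the rows are paired by row label and not by race. The same label alignment is the likely reason the .loc assignment in the question left r9 as NaN: r9 sits at label 8 in dfB, and dfA has no label 8.

```python
import numpy as np
import pandas as pd

# Hypothetical reduced frames: "r9" is row label 2 in dfB but row label 1 in dfA.
dfA = pd.DataFrame({"race": ["r1", "r9"], "participant": ["rat", "dog"]})
dfB = pd.DataFrame({"race": ["r1", "r2", "r9"], "participant": ["rat", "rat", np.nan]})

# Default integer index: dfA has no label 2, so nothing is found to fill r9 with.
by_position = dfB.fillna(dfA)

# Indexed by race: rows are matched on the race value instead.
by_race = dfB.set_index("race").fillna(dfA.set_index("race")).reset_index()

print(by_position)
print(by_race)
```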

Or use update:

dfB = dfB.set_index('race')
dfA = dfA.set_index('race')

dfB.update(dfA)

print(dfB.reset_index())
  race qualifier participant
0   r1      last         rat
1   r2       NaN         rat
2   r3     first         rat
3   r4     first         cat
4   r5     first         cat
5   r6      last         rat
6   r7      last         dog
7   r8     first         dog
8   r9     first         dog
9  r10       NaN         NaN
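The two approaches give the same result on the sample data, but they are not equivalent in general. A minimal sketch with made-up values (the frames left and right are assumptions for illustration): fillna only fills cells that are NaN, while update works in place and overwrites existing values with every non-NaN value from the other frame.

```python
import numpy as np
import pandas as pd

# Hypothetical frames already indexed by race; they disagree on r4.
left = pd.DataFrame({"participant": ["cat", np.nan]}, index=["r4", "r9"])
right = pd.DataFrame({"participant": ["rat", "dog"]}, index=["r4", "r9"])

# fillna returns a new frame and touches only the missing cell (r9).
filled = left.fillna(right)

# update modifies the frame in place and returns None, so work on a copy.
updated = left.copy()
updated.update(right)

print(filled)
print(updated)
```

So fillna is the safer choice when dfB's existing values should win, and update when dfA's values should win.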

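Correct me if I am wrong. If you want to update a row or several columns, you can update the values at specific index positions of that column. For example, if I want to update all rows in column B, then: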

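import pandas as pd
from pprint import pprint

df = pd.DataFrame({'A':[1,2,3],'B': [4,5,6]})
df1 = pd.DataFrame({'B':[7,8,9]})
# update matches on index and column name, so only column B changes
df.update(df1)
pprint(df)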
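If only some rows should change, the frame passed to update can carry just those index labels. A minimal sketch, where the choice of labels 0 and 2 is an assumption for illustration:

```python
import pandas as pd

df = pd.DataFrame({'A': [1, 2, 3], 'B': [4, 5, 6]})

# df1 only has rows for index labels 0 and 2, so label 1 is left untouched.
df1 = pd.DataFrame({'B': [7, 9]}, index=[0, 2])

df.update(df1)  # in place; matches on index label and column name
print(df)
```

Column A is unchanged because df1 has no column A, and the row at label 1 keeps its original B value.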
