Conditionally copy a value from one column and apply it to all matching rows in a Python dataframe

Posted 2024-06-15 04:43:36

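This seems simple, but I'm not sure what I'm missing. I need to copy the 'code_num' of the 'primary_fruit' and apply it as a reference to every row whose fruit is in 'fruit_list'. If df['fruits'] is not in fruit_list, the row should keep its own 'code_num'.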

fruit_list= ["apple","banana","cherry"]
primary_fruit = 'banana'

print(df)

Store code_num     fruits
A     101          apple  
A     102          cherry 
A     103          cherry 
A     104          banana 
A     105          cherry 
A     106          rambo  
B     201          cherry
B     202          banana
B     203          toy  

Expected dataframe:

Store code_num    fruits      reference
A     101          apple       104
A     102          cherry      104
A     103          cherry      104
A     104          banana      104
A     105          cherry      104
A     106          rambo       106
B     201          cherry      202
B     202          banana      202
B     203          toy         203

I wrote the following code, but the output is incorrect:

s = df['fruits'].isin(fruit_list)
df.loc[s,'reference'] = df.groupby([s,'Store'])['code_num'].transform('max')


2 Answers

You can try:

primary_code = df.query('fruits == @primary_fruit')['code_num'].values[0]

df['reference'] = df['code_num'].where(~df['fruits'].isin(fruit_list), primary_code)

Update to account for Store:

primary_codes = (
    df
    .set_index('Store')
    .query('fruits == @primary_fruit')['code_num']
    .to_dict()
    )

df['reference'] = df.apply(lambda x: 
                     x['code_num'] 
                     if x['fruits'] not in fruit_list 
                     else primary_codes.get(x['Store']),
                   axis=1)
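
The row-wise apply above can also be written without a Python-level loop. Here is a minimal sketch, assuming each Store has at most one relevant row for the primary fruit (duplicates are dropped, keeping the first). The sample frame is rebuilt from the question, and `primary_by_store` and `in_list` are names introduced here for illustration only.

```python
import pandas as pd

# Sample data rebuilt from the question
df = pd.DataFrame({
    "Store": ["A"] * 6 + ["B"] * 3,
    "code_num": [101, 102, 103, 104, 105, 106, 201, 202, 203],
    "fruits": ["apple", "cherry", "cherry", "banana", "cherry",
               "rambo", "cherry", "banana", "toy"],
})
fruit_list = ["apple", "banana", "cherry"]
primary_fruit = "banana"

# Series indexed by Store holding the code_num of the primary fruit row
primary_by_store = (
    df.loc[df["fruits"].eq(primary_fruit)]
      .drop_duplicates("Store")          # assumption: keep the first match per Store
      .set_index("Store")["code_num"]
)

# True for rows whose fruit is in fruit_list
in_list = df["fruits"].isin(fruit_list)

# Keep code_num where the fruit is NOT listed; otherwise use the store's primary code
df["reference"] = df["code_num"].where(~in_list, df["Store"].map(primary_by_store))

print(df)
```

If a Store has no row for the primary fruit, `map` returns NaN for that Store's listed fruits, so the column would become float and need separate handling.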

Check this:

df['refer']=df['code_num']


df.loc[df.fruits.isin(fruit_list),'refer']=df.loc[df.fruits.eq(primary_fruit),'refer'].iloc[0]

df
Out[26]: 
  Store  code_num  fruits  refer
0     A       101   apple    104
1     A       102  cherry    104
2     A       103  cherry    104
3     A       104  banana    104
4     A       105  cherry    104
5     A       106   rambo    106

Update: we need Categorical to reorder the dataframe first, and then transform:

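import pandas as pd
# Map every listed fruit to the primary fruit so they share one group per Store
d=dict.fromkeys( fruit_list, primary_fruit)
# Reorder rows so the primary fruit comes first among the listed fruits
newdf = df.iloc[pd.Categorical(df.fruits, ["banana","apple","cherry"]).argsort()]
df['ref']=newdf.groupby([newdf.Store, newdf.fruits.replace(d) ])['code_num'].transform('first')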
df
Out[46]: 
  Store  code_num  fruits  ref
0     A       101   apple  104
1     A       102  cherry  104
2     A       103  cherry  104
3     A       104  banana  104
4     A       105  cherry  104
5     A       106   rambo  106
6     B       201  cherry  202
7     B       202  banana  202
8     B       203     toy  203
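
The category order in the snippet above is hard-coded. As a sketch, it could instead be derived from `primary_fruit` and `fruit_list`. The name `order` is introduced here for illustration, the sample frame is rebuilt from the question, and the approach assumes every Store that has listed fruits also has a row for the primary fruit.

```python
import pandas as pd

# Sample data rebuilt from the question
df = pd.DataFrame({
    "Store": ["A"] * 6 + ["B"] * 3,
    "code_num": [101, 102, 103, 104, 105, 106, 201, 202, 203],
    "fruits": ["apple", "cherry", "cherry", "banana", "cherry",
               "rambo", "cherry", "banana", "toy"],
})
fruit_list = ["apple", "banana", "cherry"]
primary_fruit = "banana"

# Put the primary fruit first, followed by the other listed fruits
order = [primary_fruit] + [f for f in fruit_list if f != primary_fruit]

# Every listed fruit is relabelled as the primary fruit for grouping purposes
d = dict.fromkeys(fruit_list, primary_fruit)

# Sort rows by category so the primary fruit precedes the other listed fruits
newdf = df.iloc[pd.Categorical(df["fruits"], categories=order).argsort()]

# Within each (Store, relabelled fruit) group, take the first code_num;
# the result is aligned back to df by index on assignment
df["ref"] = (
    newdf.groupby([newdf["Store"], newdf["fruits"].replace(d)])["code_num"]
         .transform("first")
)

print(df)
```

If a Store lacks the primary fruit, `first` would silently pick the code of whichever listed fruit sorts earliest, so that case is worth checking before relying on the result.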
