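I want to eliminate the nested loops in my code, but I can't seem to find the best way to do it. I've explained what I'm trying to do below.
I have a DataFrame df: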
import pandas as pd

data = [
    ['1A', 'apple', '35-44', 'male', ['apple', 'strawberry', 'pineapple']],
    ['1B', 'banana', '15-24', 'female', ['apple', 'banana', 'durian']],
    ['1C', 'cranberry', '35-44', 'male', ['cranberry', 'apple', 'durian']],
    ['1D', 'durian', '15-24', 'female', ['durian', 'kiwi', 'banana']],
    ['1E', 'elderberry', '35-44', 'male', ['elderberry', 'apple', 'papaya']],
]
df = pd.DataFrame(data, columns=['ID', 'fav_fruit', 'age_group', 'gender', 'top3_fruits'])
   ID   fav_fruit age_group  gender                     top3_fruits
0  1A       apple     35-44    male  [apple, strawberry, pineapple]
1  1B      banana     15-24  female         [apple, banana, durian]
2  1C   cranberry     35-44    male      [cranberry, apple, durian]
3  1D      durian     15-24  female          [durian, kiwi, banana]
4  1E  elderberry     35-44    male     [elderberry, apple, papaya]
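Now, within this DataFrame, I want to compare each row against every other row and check a set of conditions: the row's 'fav_fruit' appears in the other row's 'top3_fruits', and both rows have the same 'gender' and the same 'age_group'.
If the conditions are met, I want to append the matching row's 'ID' and 'top3_fruits' as separate columns at the end of DataFrame df.
This is the code I wrote with nested for loops: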
df_copy = df.copy()
matched_rows = []
matching_id = []
fruits_to_recommend = []
for i in range(len(df)):
    for j in range(len(df)):
        if (i != j
                and df.iloc[i]['fav_fruit'] in df_copy.iloc[j]['top3_fruits']
                and df.iloc[i]['gender'] == df_copy.iloc[j]['gender']
                and df.iloc[i]['age_group'] == df_copy.iloc[j]['age_group']):
            # DataFrame.append was removed in pandas 2.0, so collect the
            # matched rows in a list and concatenate them once at the end.
            matched_rows.append(df_copy.iloc[[i]])
            matching_id.append(df_copy.iloc[j]['ID'])
            fruits_to_recommend.append(df_copy.iloc[j]['top3_fruits'])
# pd.concat raises on an empty list, so fall back to an empty DataFrame.
sample_df = pd.concat(matched_rows) if matched_rows else pd.DataFrame()
sample_df['matching_id'] = matching_id
sample_df['fruits_to_recommend'] = fruits_to_recommend
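I'm looking for a more practical and faster alternative.
First check your 3 conditions, then for each row build a DataFrame containing its matching rows. Finally, join that back onto the original df.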
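One possible way to sketch this idea without explicit loops is a self-merge on the two equality conditions, followed by a filter for the remaining ones. This is not necessarily the code the answer had in mind. It assumes that 'ID' is unique per row (so comparing IDs stands in for the i != j check), and the names candidates, pairs and result are illustrative only.

```python
import pandas as pd

data = [
    ['1A', 'apple', '35-44', 'male', ['apple', 'strawberry', 'pineapple']],
    ['1B', 'banana', '15-24', 'female', ['apple', 'banana', 'durian']],
    ['1C', 'cranberry', '35-44', 'male', ['cranberry', 'apple', 'durian']],
    ['1D', 'durian', '15-24', 'female', ['durian', 'kiwi', 'banana']],
    ['1E', 'elderberry', '35-44', 'male', ['elderberry', 'apple', 'papaya']],
]
df = pd.DataFrame(data, columns=['ID', 'fav_fruit', 'age_group', 'gender', 'top3_fruits'])

# Candidate partner rows: the two columns to append (renamed) plus the join keys.
candidates = df[['ID', 'top3_fruits', 'gender', 'age_group']].rename(
    columns={'ID': 'matching_id', 'top3_fruits': 'fruits_to_recommend'}
)

# Self-merge: pairs every row with every row sharing its gender and age_group.
pairs = df.merge(candidates, on=['gender', 'age_group'])

# Remaining conditions: not the same row (assumes unique IDs), and the
# row's fav_fruit is contained in the partner's top-3 list.
is_other_row = pairs['ID'] != pairs['matching_id']
fav_in_top3 = pd.Series(
    [fav in top3 for fav, top3 in zip(pairs['fav_fruit'], pairs['fruits_to_recommend'])],
    index=pairs.index,
    dtype=bool,
)

result = pairs[is_other_row & fav_in_top3].reset_index(drop=True)
print(result[['ID', 'fav_fruit', 'matching_id', 'fruits_to_recommend']])
```

The merge only pairs rows within the same gender and age_group, so far fewer comparisons are made than in the full i-by-j loop when the groups are small. Within a large group the number of pairs still grows quadratically, so memory use is worth checking on real data.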
My approach is to use the ^{} method and the ^{} function.