What is the best way to combine two string columns in pandas into a new column based on a specific condition?

Posted 2024-09-20 03:53:52


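I have a DataFrame with string values in every column. I want to combine column 1 and column 2 into a new column, say column 4. However, if the words in column 1 and column 2 are the same, I want to combine column 1 and column 3 into the new column instead.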

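I tried putting the pairs into a list first and then adding that list as a separate column, but it did not work. I am new to Python, so I suspect I am missing a simpler solution.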

pairs = []
for row in df['interest1']:
    if row == df['interest2'].iloc[row]:
        pairs.append(df['interest1'] + ' ' + df['interest2'])
    else:
        pairs.append(df['interest1'] + ' ' + df['interest3'])

# A simple example of what I would like to achieve

import pandas as pd

lst= [['music','music','film','music film'],
      ['guitar','piano','violin','guitar piano'],
      ['music','photography','photography','music photography'],
     ]

df= pd.DataFrame(lst,columns=['interest1','interest2','interest3','first distinct pair'])
df

1 answer
User
#1 · Posted 2024-09-20 03:53:52

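You can use the where method. Adding the two columns produces a Series, and Series.where keeps each value where the condition is True and takes the replacement value where it is False: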

df['first_distinct_pair'] = (df['interest1'] + df['interest2']).where(df['interest1'] != df['interest2'],  df['interest1'] + df['interest3'])
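
In case the direction of the condition is confusing, here is a minimal sketch of how Series.where picks values. The toy Series s, fallback and keep are made-up names for illustration only and are not part of the question's data.

```python
import pandas as pd

# Hypothetical toy data, not taken from the question.
s = pd.Series(['a', 'b', 'c'])
fallback = pd.Series(['x', 'y', 'z'])
keep = pd.Series([True, False, True])

# Where keep is True the original value stays;
# where keep is False the value from fallback is used instead.
result = s.where(keep, fallback)
print(result.tolist())  # ['a', 'y', 'c']
```

In the answer's expression the condition is df['interest1'] != df['interest2'], so the interest1 + interest2 pair is kept when the two words differ, and the interest1 + interest3 pair is substituted when they are the same.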

To separate the two words with a space in the new column, concatenate a ' ' between the columns:

df['first_distinct_pair'] = (df['interest1'] + ' '+ df['interest2']).where(df['interest1'] != df['interest2'],  df['interest1'] + ' ' + df['interest3'])

The result is:

>>> import pandas as pd
>>> lst = [['music', 'music', 'film'],
...        ['guitar', 'piano', 'violin'],
...        ['music', 'photography', 'photography']]
>>> df = pd.DataFrame(lst, columns=['interest1', 'interest2', 'interest3'])
>>> df['first_distinct_pair'] = (df['interest1'] + ' ' + df['interest2']).where(df['interest1'] != df['interest2'], df['interest1'] + ' ' + df['interest3'])
>>> df
  interest1    interest2    interest3 first_distinct_pair
0     music        music         film          music film
1    guitar        piano       violin        guitar piano
2     music  photography  photography   music photography
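
If you would rather keep an explicit loop, as in the question, something along these lines should give the same column. The loop in the question fails because it iterates over the string values of interest1 and then passes those strings to iloc, which expects integer positions, and because it appends whole columns rather than one value per row. This is only a sketch of one possible fix, not the answerer's method; the variable names first, second, third and partner are my own.

```python
import pandas as pd

df = pd.DataFrame(
    [['music', 'music', 'film'],
     ['guitar', 'piano', 'violin'],
     ['music', 'photography', 'photography']],
    columns=['interest1', 'interest2', 'interest3'],
)

# zip walks the three columns in parallel, one row at a time.
pairs = []
for first, second, third in zip(df['interest1'], df['interest2'], df['interest3']):
    # Use the third column only when the first two words are identical.
    partner = third if first == second else second
    pairs.append(first + ' ' + partner)

df['first_distinct_pair'] = pairs
print(df['first_distinct_pair'].tolist())
# ['music film', 'guitar piano', 'music photography']
```

The where version is usually preferable on large frames because it works on whole columns at once, but the loop makes the row-by-row rule easier to read.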
