Find elements from a list in a DataFrame column (the column holds lists)

The list:

list_m = ['KathyConWom', 'monkeyhead78', 'acorncarver', 'bonglez', '9NewsQueensland', 'paulinedaniels', 'AdvoBarryRoux', '_sara_jade_', 'theage', 'gaskell_mike', 'saidtarraf', 'BroHilderchump', 'jodyvance', 'COdendahl', 'pfizer', 'RobertKennedyJr', 'Real_Sheezzii', 'Kellie_Martin', 'ThatsOurWaldo', 'SCN_Nkosi', 'azsweetheart013']

The DataFrame:

    user_id              text                                               tweet_id             user_name         mention
22  1334471712528855040  @KathyConWom @JamesDelingpole Time to stand-up...  1362119551375314948  @KYourrights      [KathyConWom, JamesDelingpole]
23  334131548            @KathyConWom @Exp_Sec_Prof It seems like weste...  1362096715877212161  @GowTolson        [KathyConWom, Exp]
24  1252182507715526657  @KathyConWom I guess that the hard part would ...  1362096654514552837  @Peterpu52451065  [KathyConWom]

Expected output:

    user_id              text                                               tweet_id             user_name         mention                         new_col
22  1334471712528855040  @KathyConWom @JamesDelingpole Time to stand-up...  1362119551375314948  @KYourrights      [KathyConWom, JamesDelingpole]  KathyConWom
23  334131548            @KathyConWom @Exp_Sec_Prof It seems like weste...  1362096715877212161  @GowTolson        [KathyConWom, Exp]              KathyConWom
24  1252182507715526657  @KathyConWom I guess that the hard part would ...  1362096654514552837  @Peterpu52451065  [azsweetheart013]               azsweetheart013

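3 answers

User

#1 · Edited 2024-05-09 23:18:46

You can also use np.intersect1d to get the unique common elements of each row's list and list_m. It returns them as a sorted array, as shown below: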

import numpy as np

df['new_col'] = df['mention'].map(lambda x: np.intersect1d(x, list_m))

If you want to turn the result into a comma-separated string, just chain it with .str.join(), as shown below:

import numpy as np

df['new_col'] = df['mention'].map(lambda x: np.intersect1d(x, list_m)).str.join(', ')

You can also simply use a list comprehension inside .apply(), as shown below:

df['new_col'] = df['mention'].apply(lambda x: [y for y in x if y in list_m]).str.join(', ')
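To see how the two approaches in this answer differ, here is a minimal, self-contained sketch. The sample frame is cut down to the `mention` column and `list_m` is shortened; both are stand-ins for illustration, not the question's full data. `np.intersect1d` returns sorted, de-duplicated values, while the list comprehension keeps the original order and any repeated names.

```python
import numpy as np
import pandas as pd

# Shortened stand-ins for the question's data (illustrative only)
list_m = ['KathyConWom', 'monkeyhead78', 'azsweetheart013']
df = pd.DataFrame(
    {'mention': [['KathyConWom', 'JamesDelingpole'],
                 ['KathyConWom', 'Exp'],
                 ['azsweetheart013', 'KathyConWom', 'azsweetheart013']]},
    index=[22, 23, 24],
)

# np.intersect1d: matches come back sorted and de-duplicated
via_numpy = df['mention'].map(lambda x: np.intersect1d(x, list_m)).str.join(', ')

# List comprehension: matches keep their original order, duplicates included
via_listcomp = df['mention'].apply(lambda x: [y for y in x if y in list_m]).str.join(', ')

print(via_numpy.tolist())
print(via_listcomp.tolist())
```

Which one fits depends on whether order and duplicates in `mention` matter for the new column.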

User

#2 · Edited 2024-05-09 23:18:46

You can use the set intersection operation to find the elements the two lists have in common:

df['new_col'] = df['mention'].apply(lambda mentions: list(set(mentions).intersection(list_m)))

To turn the result into a string, you can use:

df['new_col'] = df['mention'].apply(lambda mentions: ', '.join(set(mentions).intersection(list_m)))
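One caveat with sets is that their iteration order is not guaranteed, so the joined string may list names in a different order from run to run. A minimal sketch that builds the lookup set once and sorts the matches for stable output follows; `set_m` is a hypothetical helper name and the data is a shortened stand-in for the question's.

```python
import pandas as pd

# Shortened stand-ins for the question's data (illustrative only)
list_m = ['KathyConWom', 'monkeyhead78', 'azsweetheart013']
df = pd.DataFrame({'mention': [['KathyConWom', 'JamesDelingpole'],
                               ['azsweetheart013', 'KathyConWom'],
                               ['Exp']]})

# Hypothetical helper: build the set once instead of once per row
set_m = set(list_m)

# sorted() puts the intersection in a deterministic order before joining
df['new_col'] = df['mention'].apply(
    lambda mentions: ', '.join(sorted(set_m.intersection(mentions)))
)
print(df['new_col'].tolist())
```

Rows with no match end up with an empty string.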

User

#3 · Edited 2024-05-09 23:18:46

Try this:

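def add(x):
    # Build a comma-separated string of the mentions that appear in list_m,
    # keeping the order in which they occur in the row
    ret = ''
    for y in x:
        if y in list_m:
            if len(ret) > 0:
                ret += ','
            ret += y
    return ret

# Pass the function directly; wrapping it in a lambda is not needed
df['new_col'] = df['mention'].apply(add)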
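The expected output in the question shows a single name per row rather than a joined string. If only the first match is wanted, a variant along these lines may work. `first_match` and `set_m` are hypothetical names, and the data is a shortened stand-in for the question's.

```python
import pandas as pd

# Shortened stand-ins for the question's data (illustrative only)
list_m = ['KathyConWom', 'monkeyhead78', 'azsweetheart013']
df = pd.DataFrame({'mention': [['KathyConWom', 'JamesDelingpole'],
                               ['Exp', 'azsweetheart013', 'KathyConWom'],
                               ['Exp']]})

set_m = set(list_m)  # set membership checks are faster than list lookups

def first_match(mentions):
    # Return the first mention that is in list_m, or '' when none matches
    return next((m for m in mentions if m in set_m), '')

df['new_col'] = df['mention'].apply(first_match)
print(df['new_col'].tolist())
```

Returning an empty string for rows without a match mirrors the behaviour of the `add` function above.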
