Pandas: count occurrences of words (from another DataFrame) and output the count and the matched words

import pandas as pd
import re

df = pd.DataFrame({'sentence': ['Hello how are you',
                                'It is nice outside today',
                                'I need to water the plants',
                                'I need to cook dinner',
                                'See you tommorow']})
print(df)

df2 = pd.DataFrame({'words': ['hello', 'you', 'plants', 'need', 'tommorow']})
print(df2)

df["count"] = df["sentence"].str.count('|'.join(df2['words']), re.I)
print(df)

df_desiredoutput = pd.DataFrame({'sentence': ['Hello, how are you?',
                                              'It is nice outside today',
                                              'I need to water the plants',
                                              'I need to cook dinner',
                                              'See you tommorow'],
                                 'count': ['2', '0', '2', '1', '2'],
                                 'match': ['hello; you', '', 'need; plants', 'need', 'you; tomorrow']})
print(df_desiredoutput)

1 Answer

User

#1 · Posted 2024-04-19 03:15:54

Use Series.str.findall together with Series.str.join:

pat = '|'.join(df2['words'])
df["count"] = df["sentence"].str.count(pat, re.I)
df["match"] = df["sentence"].str.findall(pat, re.I).str.join('; ')
print(df)
                     sentence  count          match
0           Hello how are you      2     Hello; you
1    It is nice outside today      0               
2  I need to water the plants      2   need; plants
3       I need to cook dinner      1           need
4            See you tommorow      2  you; tommorow
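
One caveat with the pattern above: a plain '|'.join also matches inside longer words (for example 'you' inside 'Your'), and any regex metacharacters in the word list would be interpreted as regex syntax. The following is a minimal sketch of one possible variant, not part of the original answer, that escapes each word with re.escape and wraps the alternation in word boundaries. The sixth sentence is a hypothetical row added only to show the difference, and deriving count from the length of the findall result is an assumption about how you might want to keep the two columns consistent.

```python
import re
import pandas as pd

df = pd.DataFrame({'sentence': ['Hello how are you',
                                'It is nice outside today',
                                'I need to water the plants',
                                'I need to cook dinner',
                                'See you tommorow',
                                'Your plants needed water']})  # hypothetical extra row
df2 = pd.DataFrame({'words': ['hello', 'you', 'plants', 'need', 'tommorow']})

# Escape each word and require word boundaries on both sides.
# The group is non-capturing so findall returns the full match text.
pat = r'\b(?:' + '|'.join(map(re.escape, df2['words'])) + r')\b'

# Run the regex once, then derive both columns from the same result.
matches = df['sentence'].str.findall(pat, flags=re.I)
df['count'] = matches.str.len()
df['match'] = matches.str.join('; ')
print(df)
```

With this pattern the last row matches only 'plants', because 'Your' and 'needed' are no longer treated as hits for 'you' and 'need'.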
