Matching strings against dictionary keys in Python

Posted 2024-04-25 15:13:24


I have a list of strings and a dictionary. For example:

list = ["apple fell on Newton", "lemon is yellow","grass is greener"]
dict = {"apple" : "fruits", "lemon" : "vegetable"}

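The task is to match each string in the list against the keys of the dictionary. If a key matches, return that key's value.

At the moment I am using the approach below, which is very slow. Can anyone help me find a more efficient way?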

lmb_extract_type = (lambda post: list(filter(None, set(dict.get(w)[0] if w in post.lower().split() else None for w in dict))))

 df['type']  = df[list].apply(lmb_extract_type)
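
For readers following along, the lambda above appears intended to do roughly the following. This is only a sketch of one reading of it, not the asker's exact code: the names `sentences`, `mapping`, and `extract_types` are hypothetical (the question's `list` and `dict` shadow Python built-ins, so they are renamed here), and it looks each word up in the dictionary rather than scanning every key.

```python
sentences = ["apple fell on Newton", "lemon is yellow", "grass is greener"]
mapping = {"apple": "fruits", "lemon": "vegetable"}

def extract_types(post, mapping):
    """Return the values of all keys that occur as whole words in `post`."""
    # Lowercase and split on whitespace, as the original lambda does.
    words = set(post.lower().split())
    # Look up each word in the dict instead of testing every key per row.
    return sorted({mapping[w] for w in words if w in mapping})

results = [extract_types(s, mapping) for s in sentences]
print(results)  # [['fruits'], ['vegetable'], []]
```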

1 answer
User
#1 · Posted 2024-04-25 15:13:24

It is a single column with a string (e.g. "apple fell on Newton") in each row of the data frame. For each row, I have to match it against the keys of the dictionary and return the value of the matching key.

The number of elements in the list is around 40-50 million, so it's taking a lot of time.

IIUC, based on your comments, you can solve this easily with str.extract and Series.replace. Both are vectorized pandas methods, so no explicit Python loop is needed.

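  1. For str.extract, build a regular expression pattern from the dictionary's keys. This extracts only the keyword, apple or lemon.
  2. Then use the dictionary d directly to replace each extracted keyword with its corresponding value.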

import pandas as pd

l = ["apple fell on Newton", "lemon is yellow", "grass is greener"]
d = {"apple": "fruits", "lemon": "vegetable"}

df = pd.DataFrame(l, columns=['sentences'])  # Single-column DataFrame to demonstrate

pattern = '(' + '|'.join(d.keys()) + ')'  # Regular expression built from the dictionary keys
# expand=False returns a Series, so Series.replace maps each extracted key to its value
df['type'] = df.sentences.str.extract(pattern, expand=False).replace(d)
print(df)
              sentences       type
0  apple fell on Newton     fruits
1       lemon is yellow  vegetable
2      grass is greener        NaN
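
One caveat with the pattern above: a plain alternation also matches inside longer words (for example "apple" inside "pineapple"), whereas the question's original code compared whole, lowercased words. A minimal sketch of a stricter variant follows, assuming whole-word, case-insensitive matching is what is wanted. The extra "pineapple" row is made up for illustration, and `map` is swapped in for `replace` here as a plain dictionary lookup.

```python
import re
import pandas as pd

# The second sentence is a made-up row to show the whole-word behaviour.
l = ["apple fell on Newton", "pineapple is sweet", "lemon is yellow", "grass is greener"]
d = {"apple": "fruits", "lemon": "vegetable"}

df = pd.DataFrame(l, columns=["sentences"])

# Escape each key and wrap the alternation in \b so only whole words match.
pattern = r"\b(" + "|".join(re.escape(k) for k in d) + r")\b"

# Lowercase first (mirrors post.lower() in the question), extract the first
# matching key as a Series, then look it up in d; rows with no match stay NaN.
df["type"] = df["sentences"].str.lower().str.extract(pattern, expand=False).map(d)
print(df)
```

Note that str.extract returns only the first match per row, so a sentence containing several keys gets a single value here.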
