Python: exact search for words from a list in a string?

categories_to_retain = ['SOLID', 'GEOMETRIC', 'FLORAL', 'BOTANICAL', 'STRIPES', 'ABSTRACT', 'ANIMAL', 'GRAPHIC PRINT', 'ORIENTAL', 'DAMASK', 'TEXT', 'CHEVRON', 'PLAID', 'PAISLEY', 'SPORTS']

x = " Beautiful Art By Design Studio **graphic print** Creates A **TEXT** Design For This Art Driven Duvet. Printed In Remarkable Detail On A Woven Duvet, This Is An Instant Focal Point Of Any Bedroom. The Fabric Is Woven Of Easy Care Polyester And Backed With A Soft Poly/Cotton Blend Fabric. The Texture In The Fabric Gives Dimension And A Unique Look And Feel To The Duvet."
x = x.upper()
print(x)
#x = "GRAPHIC"
#x = "GRAPHIC PRINTS"
matches = [cat for cat in categories_to_retain if cat in x.split()]
matches

Output: ['TEXT']

3 Answers

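User

Answer 1 · edited 2024-07-07 08:56:38

Use a regular expression with word boundaries to get exact matches. Even for single words, your split-based logic will not work if you want to ignore punctuation: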

import re

patts = re.compile("|".join(r"\b{}\b".format(s) for s in categories_to_retain), re.I)

x = " Beautiful Art By  Design Studio **graphic print** Creates A **TEXT** Design For This Art Driven Duvet. Printed In Remarkable Detail On A Woven Duvet, This Is An Instant Focal Point Of Any Bedroom. The Fabric Is Woven Of Easy Care Polyester And Backed With A Soft Poly/Cotton Blend Fabric. The Texture In The Fabric Gives Dimension And A Unique Look And Feel To The Duvet."

print(patts.findall(x))

This gives you:

['graphic print', 'TEXT']
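One caveat with building the pattern by plain string formatting: a category that contains a regex metacharacter would be read as pattern syntax rather than as literal text. The following is only a sketch of one way to guard against that with `re.escape`. The category 'CHECK+STRIPE', the helper name `build_pattern`, and the sample sentence are hypothetical and do not come from the question.

```python
import re

# Hypothetical list: 'CHECK+STRIPE' is not in the original question; it is
# included only to show a category containing a regex metacharacter.
categories = ['GRAPHIC PRINT', 'TEXT', 'CHECK+STRIPE']

def build_pattern(words):
    """Compile one case-insensitive, whole-word alternation of literal words."""
    # Longest first, so a longer phrase is tried before a shorter one
    # that could start at the same position.
    ordered = sorted(words, key=len, reverse=True)
    return re.compile(
        "|".join(r"\b{}\b".format(re.escape(w)) for w in ordered),
        re.I,
    )

pattern = build_pattern(categories)
found = pattern.findall(
    "A **graphic print** with TEXT and a check+stripe border, plus Texture."
)
print(found)  # ['graphic print', 'TEXT', 'check+stripe']
```

Without `re.escape`, the `+` in 'CHECK+STRIPE' would mean "one or more K", so the literal text "check+stripe" would not match. Note that `\b` only behaves as expected when each category begins and ends with a word character.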

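User

Answer 2 · edited 2024-07-07 08:56:38

You can use a regular expression, which also avoids matching arbitrary character sequences inside longer words and reports only whole-word matches.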

import re
matches = []
categories_to_retain = ['SOLID',
     'GEOMETRIC',
     'FLORAL',
     'BOTANICAL',
     'STRIPES',
     'ABSTRACT',
     'ANIMAL',
     'GRAPHIC PRINT',
     'ORIENTAL',
     'DAMASK',
     'TEXT',
     'CHEVRON',
     'PLAID',
     'PAISLEY',
     'SPORTS']

x = " Beautiful Art By  Design Studio **graphic print** Creates A **TEXT** Design For This Art Driven Duvet. Printed In Remarkable Detail On A Woven Duvet, This Is An Instant Focal Point Of Any Bedroom. The Fabric Is Woven Of Easy Care Polyester And Backed With A Soft Poly/Cotton Blend Fabric. The Texture In The Fabric Gives Dimension And A Unique Look And Feel To The Duvet."

x = x.upper()

print(x)

def searchWholeWord(w):
    return re.compile(r'\b({0})\b'.format(w), flags=re.IGNORECASE).search

for cat in categories_to_retain:
    return_value = searchWholeWord(cat)(x)
    if return_value:
        matches.append(cat)

print(matches)

Output of the final print(matches):

['GRAPHIC PRINT', 'TEXT']
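If compiling one regex per category feels heavy, the same idea can be sketched with a single compiled pattern whose hits are mapped back to the names in the list. This is an illustrative variant, not the answerer's code: the function name `find_categories` and the shortened sample text are assumptions.

```python
import re

def find_categories(text, categories):
    """Return the categories (in list order) that occur in text as whole words."""
    pattern = re.compile(
        "|".join(r"\b{}\b".format(re.escape(c)) for c in categories),
        re.IGNORECASE,
    )
    # Normalise each hit to upper case so it can be compared with the list.
    hits = {m.group(0).upper() for m in pattern.finditer(text)}
    return [c for c in categories if c.upper() in hits]

sample = "Studio **graphic print** Creates A **TEXT** Design. The Texture In The Fabric."
result = find_categories(sample, ['SOLID', 'GEOMETRIC', 'GRAPHIC PRINT', 'TEXT'])
print(result)  # ['GRAPHIC PRINT', 'TEXT']
```

One design trade-off: `finditer` scans for non-overlapping matches, so if two categories overlap (say, a hypothetical 'GRAPHIC' alongside 'GRAPHIC PRINT'), only one of them is reported at a given position. The per-category loop above does not have that limitation.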

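User

Answer 3 · edited 2024-07-07 08:56:38

You are splitting the string with the default split(), which splits on whitespace: x.split() contains the strings "GRAPHIC" and "PRINT", but not "GRAPHIC PRINT". You probably want "if cat in x" instead, which I believe returns what you need in this case.

This should work: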

matches = [cat for cat in categories_to_retain if cat in x]
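To make the difference concrete, here is a small sketch on a shortened, made-up sentence (not the question's text). It shows why the split-based test misses a two-word category, and also that a plain substring test is not a whole-word test.

```python
x = "A GRAPHIC PRINT DESIGN WITH TEXTURE."

# split() only yields single whitespace-separated tokens, so a two-word
# category can never be one of them.
tokens = x.split()
print('GRAPHIC PRINT' in tokens)  # False
print('GRAPHIC' in tokens)        # True

# A plain substring test finds the phrase, but it also matches inside
# longer words: 'TEXT' is found here only because of 'TEXTURE'.
print('GRAPHIC PRINT' in x)       # True
print('TEXT' in x)                # True
```

In the question's string the substring test happens to give the wanted result because "TEXT" also appears as a standalone word. If strictly exact word matches are required, a word-boundary regex is the safer choice.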
