Building a regex for a complex string in Python


Posted 2024-06-01 21:25:31



I have a series of product ingredient strings like this:

text = 'Pork and beef, water, salt (1,7%), spices (white pepper, nutmeg, coriander, cardamom), stabilizer (E450), glucose, antioxidant (E316), a preservative (E250), flavorings'

I want to extract all of the ingredient names from it, so that the result looks like this:

ingredientsList= ['Pork and beef', 'salt', 'spices', 'white pepper', 'nutmeg', 'coriander', 'cardamom', 'stabilizer', 'glucose', 'antioxidant', 'preservative', 'flavorings']

The regex I am currently using is:

ingredients = re.findall(r'\([^()]*\)|([^\W\d]+(?:\s+[^\W\d]+)*)', text)

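But it does not return the text inside the parentheses. I only want to leave out the codes and percentages, and keep every ingredient that appears inside parentheses. How can I do that? Thanks in advance.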


1 answer

User

#1 · Posted 2024-06-01 21:25:31

You can restrict the first alternative so that it matches only codes that start with E followed by digits:

\(E\d+\)|([^\W\d]+(?:\s+[^\W\d]+)*)


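Now \(E\d+\) matches only substrings like (Exxx), so the contents of the other parenthesized groups are left for the second alternative to capture. You can also add the percentages to the first alternative to skip them explicitly.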
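The exact pattern for the percentage variant is not preserved in this answer, so the following is only a sketch of one way it could look. It assumes percentages are written like (1,7%) or (1.7%); the name rx_pct, the shortened sample string, and the percentage sub-pattern are illustrative assumptions.

```python
import re

# Assumed variant: the first alternative also consumes parenthesized
# percentages such as (1,7%) or (1.7%), so they are skipped explicitly.
rx_pct = r"\((?:E\d+|\d+(?:[.,]\d+)?%)\)|([^\W\d]+(?:\s+[^\W\d]+)*)"

sample = "salt (1,7%), spices (white pepper, nutmeg), stabilizer (E450)"

# findall returns group 1 for every match; matches of the first
# alternative leave group 1 empty, so drop the empty strings.
found = [m for m in re.findall(rx_pct, sample) if m]
print(found)
```

For this input the result is the same with or without the percentage branch, because the second alternative only matches letters and never picks up "1,7%". The extra branch just makes the intent explicit.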

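import re

# Skip (E<digits>) codes; capture runs of letter-only words in group 1.
rx = r"\(E\d+\)|([^\W\d]+(?:\s+[^\W\d]+)*)"
s = "Pork and beef, water, salt (1,7%), spices (white pepper, nutmeg, coriander, cardamom), stabilizer (E450), glucose, antioxidant (E316), a preservative (E250), flavorings"
# Matches of the first alternative give an empty group 1, so filter them out.
res = [x for x in re.findall(rx, s) if x]
print(res)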
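To see why narrowing the first alternative matters, the two patterns can be compared side by side on a shortened sample. This is a small sketch; the variable names and the shortened string are illustrative.

```python
import re

text = "salt (1,7%), spices (white pepper, nutmeg), stabilizer (E450)"

# Original pattern: every parenthesized group is consumed by the first
# alternative, which has no capture group, so its contents are lost.
old_rx = r"\([^()]*\)|([^\W\d]+(?:\s+[^\W\d]+)*)"
# Narrowed pattern: only (E<digits>) is consumed without capturing.
new_rx = r"\(E\d+\)|([^\W\d]+(?:\s+[^\W\d]+)*)"

old_res = [m for m in re.findall(old_rx, text) if m]
new_res = [m for m in re.findall(new_rx, text) if m]

print(old_res)  # ['salt', 'spices', 'stabilizer']
print(new_res)  # ['salt', 'spices', 'white pepper', 'nutmeg', 'stabilizer']
```

On the full string from the question, the narrowed pattern also returns 'water' and 'a preservative'. The desired list in the question omits 'water' and the leading article, so matching it exactly would need a separate cleanup step.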
