Extracting a number with an embedded comma that precedes an exact match, using a Python regular expression?

Posted 2024-09-28 03:22:45


I'm using Python 3 and trying to extract a specific part of a string with a regex. This is the string:

data = "'Star Wars: The Last Jedi (Theatrical Version)MPAA Rating: PG-13 (Parents Strongly Cautioned)|Closed Caption3.8 out of 5 stars4,738Prime Videofrom$2.99$2.99to rentStarring:Oscar Isaac,Mark Hamill,Daisy RidleyandJohn BoyegaDirected by:Rian JohnsonRuntime:151 minutes'"

The number "4,738" is the only one that always appears immediately before "Prime Videofrom$". What is the best way to extract it? Here is my code:

import re
data = "'Star Wars: The Last Jedi (Theatrical Version)MPAA Rating: PG-13 (Parents Strongly Cautioned)|Closed Caption3.8 out of 5 stars4,738Prime Videofrom$2.99$2.99to rentStarring:Oscar Isaac,Mark Hamill,Daisy RidleyandJohn BoyegaDirected by:Rian JohnsonRuntime:151 minutes'"
reviews = re.findall("[stars][\d,]+\$",data)
print(reviews)

But I get an empty list:

[]

How can I extract the number, which contains a comma, that sits right before the exact match?


2 Answers

re.search is the best fit for a single match:

Given your condition, "extract the number containing a comma that comes right before the exact match":

import re

data = "'Star Wars: The Last Jedi (Theatrical Version)MPAA Rating: PG-13 (Parents Strongly Cautioned)|Closed Caption3.8 out of 5 stars4,738Prime Videofrom$2.99$2.99to rentStarring:Oscar Isaac,Mark Hamill,Daisy RidleyandJohn BoyegaDirected by:Rian JohnsonRuntime:151 minutes'"
m = re.search(r"\d+,\d+(?=Prime Videofrom\$)", data)
reviews = m.group() if m else m
print(reviews)   # 4,738

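  • (?=Prime Videofrom\$) - a lookahead assertion: it ensures that the preceding match (the digit sequence) is followed by Prime Videofrom$, without including that text in the match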
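The lookahead approach can be generalized a little. The following is only a sketch under assumptions that the question does not state: that the count may have more than one comma group (or none), and that an integer is wanted rather than the raw string. The function name extract_count and its marker parameter are hypothetical, introduced just for illustration.

```python
import re

def extract_count(text, marker="Prime Videofrom$"):
    """Return the number right before `marker` as an int, or None if absent.

    Hypothetical helper: the name and the `marker` parameter are assumptions.
    """
    # \d[\d,]* matches a run of digits and commas that starts with a digit.
    # re.escape makes the "$" in the marker a literal character.
    # The lookahead (?=...) checks for the marker without consuming it.
    pattern = r"\d[\d,]*(?=" + re.escape(marker) + ")"
    m = re.search(pattern, text)
    return int(m.group().replace(",", "")) if m else None

print(extract_count("5 stars4,738Prime Videofrom$2.99"))      # 4738
print(extract_count("5 stars1,204,738Prime Videofrom$2.99"))  # 1204738
print(extract_count("no marker in this text"))                # None
```

Because the lookahead does not consume the marker, m.group() contains only the digits and commas, so no capture group is needed.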

Use:

import re
data = "'Star Wars: The Last Jedi (Theatrical Version)MPAA Rating: PG-13 (Parents Strongly Cautioned)|Closed Caption3.8 out of 5 stars4,738Prime Videofrom$2.99$2.99to rentStarring:Oscar Isaac,Mark Hamill,Daisy RidleyandJohn BoyegaDirected by:Rian JohnsonRuntime:151 minutes'"
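# Raw string so that \d and \$ reach the regex engine unchanged.
# With one capture group, findall returns only the captured number.
reviews = re.findall(r"(\d+,?\d*)Prime Videofrom\$", data)
print(reviews)   # ['4,738']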
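For context, here is a short sketch of why the pattern in the question returns an empty list, using a shortened sample string (the variable names text, failed, and fixed are made up for this illustration). The second pattern is just one possible alternative, not the only fix.

```python
import re

# Shortened sample of the string from the question.
text = "5 stars4,738Prime Videofrom$2.99"

# [stars] is a character class: it matches ONE character out of s, t, a, r,
# not the word "stars". The pattern then requires "$" immediately after the
# digits, but "Prime Videofrom" sits in between, so nothing matches.
failed = re.findall(r"[stars][\d,]+\$", text)
print(failed)  # []

# Spelling out the surrounding literal text and capturing only the digits
# works. With a single capture group, findall returns the group contents.
fixed = re.findall(r"stars([\d,]+)Prime Videofrom\$", text)
print(fixed)   # ['4,738']
```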
