ValueError: too many values to unpack (expected 3) with a Python regex match


Posted 2024-10-06 08:35:22


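In my Python code I have a string, and I am trying to check whether it contains a particular pattern (names followed by a number). To do this I use re.match and then call groups() on the result to get the pieces I need: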

authors_and_year = re.match(r'((.*)\. (\d{4})\.)', line)
texts, authors, year = authors_and_year.groups()

If I have a line like this:

Regina Barzilay and Lillian Lee. 2004. Catching the drift: Probabilistic content models, with applications to generation and summarization. In Proceedings of NAACL-HLT.

it returns this (as expected):

('Regina Barzilay and Lillian Lee. 2004.', 'Regina Barzilay and Lillian Lee', '2004')

But in some cases I have strings like this:

J. Cohen. 1968a. Weighted kappa: Nominal scale agreement with provision for scaled disagreement or partial credit. volume 70, pages 213–220

or this:

Ralph Weischedel, Jinxi Xu, and Ana Licuanan. 1968b. A hybrid approach to answering biographical questions. In Mark Maybury, editor, New Directions In Question Answering, chapter 5. AAAI Press

Here the year carries a letter suffix, so the regex above fails. To handle this case, I tried adding a second alternative to the regex, as follows:

authors_and_year = re.match(r'((.*)\. (\d{4})\.|(.*)\. (\d{4})(a-z){1}\.)', line)
texts, authors, year = authors_and_year.groups()

But it gives me this error:

ValueError: too many values to unpack (expected 3)

When I inspect authors_and_year.groups(), it looks like this:

('Regina Barzilay and Lillian Lee. 2004.', 'Regina Barzilay and Lillian Lee', '2004', None, None, None)

I don't know where the last three None values come from. Can anyone tell me what I am doing wrong?


2 Answers

User
#1 · edited 2024-10-06 08:35:22

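This is how groups behave with |: groups() returns one entry for every capturing group in the pattern, and the None values come from the alternative that did not take part in the match. See:
>>> re.match('(foo)|(bar)', 'foo').groups()
('foo', None)
>>> re.match('(foo)|(bar)', 'bar').groups()
(None, 'bar')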
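Applied to the pattern from the question, the same rule accounts for the six-element tuple. The sketch below is only an illustration: the sample line is shortened from the question, and the variable names are made up here. It also shows a second issue in that pattern: (a-z) is a capturing group around the literal text "a-z", not the character class [a-z].

```python
import re

# The second pattern from the question, written as a raw string.
pattern = re.compile(r'((.*)\. (\d{4})\.|(.*)\. (\d{4})(a-z){1}\.)')

# Shortened version of the first sample line (assumption for brevity).
line = "Regina Barzilay and Lillian Lee. 2004. Catching the drift."
groups = pattern.match(line).groups()

# groups() has one entry per capturing group in the whole pattern,
# whichever branch of | actually matched.
group_count = pattern.groups
unmatched = groups[3:]

# (a-z) matches only the literal three characters "a-z";
# a single lowercase letter needs the character class [a-z].
literal_only = re.fullmatch(r'(a-z){1}', 'a') is None
class_matches = re.fullmatch(r'[a-z]', 'a') is not None

print(group_count, len(groups), unmatched, literal_only, class_matches)
```

Unpacking six values into three names is what raises the ValueError.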
You can filter out the groups that did not match:
>>> [group for group in re.match('(foo)|(bar)', 'foo').groups() if group is not None]
['foo']
>>> [group for group in re.match('(foo)|(bar)', 'bar').groups() if group is not None]
['bar']
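As a rough sketch of how that filtering could be applied to the citation lines from the question: the pattern below is an assumed correction of the asker's two-branch regex (the letter is matched with [a-z] and kept inside the year group so both branches yield the same number of values), and parse_citation is a hypothetical helper name.

```python
import re

# Assumed correction of the two-branch pattern: [a-z] instead of (a-z),
# with the letter captured as part of the year.
PATTERN = re.compile(r'((.*)\. (\d{4})\.|(.*)\. (\d{4}[a-z])\.)')

def parse_citation(line):
    """Return (text, authors, year), or None if the line does not match."""
    match = PATTERN.match(line)
    if match is None:
        return None
    # Drop the groups belonging to the branch that did not take part.
    text, authors, year = [g for g in match.groups() if g is not None]
    return text, authors, year

print(parse_citation("Regina Barzilay and Lillian Lee. 2004. Catching the drift."))
print(parse_citation("J. Cohen. 1968a. Weighted kappa."))
```

The unpacking only works because each branch contributes exactly two groups besides the outer one; if the branches captured different numbers of groups, the filtered list would vary in length.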
Alternatively, you can use named groups:
>>> match = re.match('(?P<first>foo)|(?P<second>bar)', 'foo')
>>> res = match.groupdict()["first"] or match.groupdict()["second"]
>>> res
'foo'
>>> match = re.match('(?P<first>foo)|(?P<second>bar)', 'bar')
>>> res = match.groupdict()["first"] or match.groupdict()["second"]
>>> res
'bar'
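If an empty match is possible (a group that matched the empty string), this code will not work, because an empty string is falsy and `or` skips it; you would need to do something like this instead:
...
res = match.groupdict()["first"]
if res is None:
    res = match.groupdict()["second"]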
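A small sketch of the empty-match case, using a made-up pattern whose first branch can match the empty string; first_matched is a hypothetical helper, not part of the re module.

```python
import re

def first_matched(match, names):
    """Return the first named group that took part in the match, else None."""
    for name in names:
        value = match.group(name)
        if value is not None:
            return value
    return None

# a* matches the empty string at position 0, so the first branch wins.
match = re.match(r'(?P<first>a*)|(?P<second>bar)', 'bar')
groups = match.groupdict()

with_or = groups["first"] or groups["second"]            # loses the empty match
with_check = first_matched(match, ["first", "second"])   # keeps it

print(repr(with_or), repr(with_check))
```

Here the `or` version yields None even though the first group did match, while the explicit `is None` test returns the empty string.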

User
#2 · edited 2024-10-06 08:35:22

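Your regex can be simplified to ((.*)\.[ ](\d{4})[a-z]?\.)
This makes the letter after the year optional while keeping exactly 3 capture groups, so the three-way unpacking works again.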
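A quick check of this pattern against shortened versions of the lines from the question might look like the sketch below. Note that with [a-z]? outside the year group, the captured year is '1968' without its letter; WITH_SUFFIX is an assumed variation (not from the answer) that moves the letter inside the group. The names parse, SIMPLIFIED and WITH_SUFFIX are illustrative only.

```python
import re

# Pattern from the answer: optional letter after the year, three groups.
SIMPLIFIED = re.compile(r'((.*)\.[ ](\d{4})[a-z]?\.)')

# Assumed variation: keep the letter as part of the captured year.
WITH_SUFFIX = re.compile(r'((.*)\.[ ](\d{4}[a-z]?)\.)')

# Sample lines shortened from the question.
lines = [
    "Regina Barzilay and Lillian Lee. 2004. Catching the drift.",
    "J. Cohen. 1968a. Weighted kappa: Nominal scale agreement.",
    "Ralph Weischedel, Jinxi Xu, and Ana Licuanan. 1968b. A hybrid approach.",
]

def parse(pattern, line):
    """Return (authors, year), or None when the line has no author/year prefix."""
    match = pattern.match(line)
    if match is None:
        return None
    text, authors, year = match.groups()
    return authors, year

for line in lines:
    print(parse(SIMPLIFIED, line), parse(WITH_SUFFIX, line))
```

Checking for None before calling groups() also avoids an AttributeError on lines that do not match at all.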
