Find any text with a comma or sp

str1 = '{5723647 9 aqua\t \tfem nom/voc pl}{5723647 9 aqua\t \tfem dat sg}{5723647 9 aqua\t \tfem gen sg}'
str2 = '{27224035 2 equo_,equus#1\t \tmasc abl sg}{27224035 2 equo_,equus#1\t \tmasc dat sg}'

3 answers

User

#1 · edited 2024-09-27 19:14:16

Something like this might work:

([^{\s,]*)\t \t([^}]*)
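
To show how the pattern above behaves, here is a minimal sketch that applies it to the two sample strings from the question with `re.findall`. The names `pattern`, `pairs1` and `pairs2` are mine, not from the question, and the sketch assumes the separator is always tab, space, tab.

```python
import re

str1 = '{5723647 9 aqua\t \tfem nom/voc pl}{5723647 9 aqua\t \tfem dat sg}{5723647 9 aqua\t \tfem gen sg}'
str2 = '{27224035 2 equo_,equus#1\t \tmasc abl sg}{27224035 2 equo_,equus#1\t \tmasc dat sg}'

# Group 1: the run of characters directly before the tab-space-tab separator
# that contains no "{", whitespace, or comma.
# Group 2: everything after the separator up to (not including) "}".
pattern = re.compile(r'([^{\s,]*)\t \t([^}]*)')

# findall returns one (word, form) tuple per {...} entry.
pairs1 = pattern.findall(str1)
pairs2 = pattern.findall(str2)

print(pairs1)
print(pairs2)
```

Because the first group excludes commas, `equo_,equus#1` yields only the part after the comma, which seems to be what the question wants.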

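User

#2 · edited 2024-09-27 19:14:16

This is going to be a minority opinion, but why not use regex for the parts that are easy to express as a regex, and plain Python for the rest? Among other things, that is more robust to changes in the format. Something like this: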

>>> import re
>>> 
>>> str1 = '{5723647 9 aqua\t \tfem nom/voc pl}{5723647 9 aqua\t \tfem dat sg}{5723647 9 aqua\t \tfem gen sg}'
>>> str2 = '{27224035 2 equo_,equus#1\t \tmasc abl sg}{27224035 2 equo_,equus#1\t \tmasc dat sg}'
>>> 
>>> pattern = re.compile(r"\{([^}]*)\}")
>>> 
>>> def extract(part):
...     ps = part.split()
...     word = ps[2].split(',')[-1]
...     form = ' '.join(ps[3:])
...     return word, form
... 
>>> for s in str1, str2:
...     for entry in re.findall(pattern, s):
...         print(extract(entry))
... 
('aqua', 'fem nom/voc pl')
('aqua', 'fem dat sg')
('aqua', 'fem gen sg')
('equus#1', 'masc abl sg')
('equus#1', 'masc dat sg')

User

#3 · edited 2024-09-27 19:14:16

pattern = re.compile(r"\{(?:.*?,|.*?)(\S+)\t \t(.*?)\}")
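
The answer gives only the pattern, so here is a hedged sketch of how it might be used on the question's sample strings. The `mixed` string and the `strict` variant are my own assumptions, not part of the answer. They illustrate that `.*?,` can run past a closing brace when an entry without a comma is followed by one with a comma.

```python
import re

str1 = '{5723647 9 aqua\t \tfem nom/voc pl}{5723647 9 aqua\t \tfem dat sg}{5723647 9 aqua\t \tfem gen sg}'
str2 = '{27224035 2 equo_,equus#1\t \tmasc abl sg}{27224035 2 equo_,equus#1\t \tmasc dat sg}'

# Pattern from the answer: optionally skip everything up to a comma, then
# capture the non-whitespace run directly before the tab-space-tab separator.
pattern = re.compile(r"\{(?:.*?,|.*?)(\S+)\t \t(.*?)\}")

result1 = pattern.findall(str1)
result2 = pattern.findall(str2)
print(result1)
print(result2)

# Hypothetical input (not from the question): an entry without a comma
# followed by an entry with one. ".*?," is free to cross the first "}".
mixed = '{1 2 aqua\t \tfem dat sg}{3 4 equo_,equus#1\t \tmasc abl sg}'
loose = pattern.findall(mixed)

# Assumed variant: [^}] instead of "." keeps each match inside one {...} entry.
strict = re.compile(r"\{(?:[^}]*?,|[^}]*?)(\S+)\t \t([^}]*)\}")
tight = strict.findall(mixed)
print(loose)
print(tight)
```

On the two sample strings the original pattern returns every entry. On the mixed string it returns only the second entry, while the stricter variant returns both.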
