How do I capture a word that ends with a dash together with the word that follows it? (regex)

Posted 2024-10-17 08:26:06

I have a piece of text like this:

IN THE 18th century, suicide was regard- ed, particularly by the French, as an English disease. 'The English destroy themselves most unaccountably,' wrote Montesquieu, and Voltaire was told that during an East wind the English hanged themselves by the dozen. True or not, the chaussure is now on the other foot. The suicide rate for men in England and Wales is about 10 per 100,000 inhabitants, com- pared with 30 in France

I want to capture these instances:

regard- ed
com- pared 

I tried r'\s[a-z].*-[a-z].*\s' to capture a word, then a dash, then another word, but that is not right.

I tried r'\s[a-z].*-', which captures:

[output not preserved in the source]

Then I tried r'-\s[a-z].*\s', which captures:

- pared with 30 in France.
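To illustrate why these patterns grab far more than a single word, here is a minimal sketch on a shortened version of the text. The variable names and the `\w*` variant are assumptions for illustration, not part of the original question. The point is that `.*` is greedy and `.` matches any character except a newline, so the match runs on to the last dash in the line.

```python
import re

# Shortened sample based on the text in the question.
sample = ("IN THE 18th century, suicide was regard- ed, particularly by the "
          "French, as an English disease. The suicide rate is about 10 per "
          "100,000 inhabitants, com- pared with 30 in France")

# Greedy '.*' starts at the first whitespace + lowercase letter
# and extends to the LAST dash in the string.
greedy = re.search(r'\s[a-z].*-', sample).group()
print(greedy[:9], '...', greedy[-4:])

# Restricting the middle to word characters keeps each match to one word.
narrow = re.findall(r'\s[a-z]\w*-', sample)
print(narrow)
```

Replacing `.*` with `\w*` is one way to stop the match at the word boundary; a non-greedy `.*?` would shorten the match as well, but it can still cross spaces and punctuation.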

I could try something like:

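import re
# The re module has no find(); re.search(...).group() returns the matched text
left = re.search(r'\s[a-z].*-', text).group().rpartition(' ')[2]
right = re.search(r'-\s[a-z].*\s', text).group().partition(' ')[0]
left[:-1] + right[2:]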

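But I am sure there is a single-regex approach that avoids all this partitioning mess. So how can I capture the desired instances with one regular expression? (Assume that a dash at the end of a word always marks a desired instance, but that a dash padded with spaces, e.g. com - pared, is not wanted.)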


2 answers

Try storing the text in a string, for example:

text = 'Suicide was regard- ed'

Then:

[code not preserved in the source]

My first attempt at an re pattern that matches what I see in the text:

r'\w+- \w+'

It seems to work fine. re.findall with this pattern, applied to the text in the question, returns

['regard- ed', 'com- pared']
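As a runnable sketch of the pattern above, using a shortened sample rather than the full text: the `sample` string, the space-padded "com - pared" case added to it, and the `re.sub` step that rejoins the two halves are assumptions for illustration and go beyond what the answer itself shows.

```python
import re

# Shortened sample; the trailing "com - pared" is added to check that a
# dash padded with spaces on both sides is not captured.
sample = ("suicide was regard- ed as an English disease, "
          "com- pared with 30 in France; not com - pared")

# \w+  one or more word characters
# '- ' a dash immediately after the word, then a single space
# \w+  the continuation of the word
pairs = re.findall(r'\w+- \w+', sample)
print(pairs)  # ['regard- ed', 'com- pared']

# Optional extra step: capture both halves and join them into one word.
joined = re.sub(r'(\w+)- (\w+)', r'\1\2', sample)
print(joined)
```

Because `\w+` must be followed directly by the dash, "com - pared" is skipped, and an ordinary hyphenated compound with no space after the dash would not match either. If the break can be followed by a newline or several spaces, `\s+` in place of the literal space may be a reasonable adjustment.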
