I have a passage of text like this:
IN THE 18th century, suicide was regard- ed, particularly by the French, as an English disease. 'The English destroy themselves most unaccountably,' wrote Montesquieu, and Voltaire was told that during an East wind the English hanged themselves by the dozen. True or not, the chaussure is now on the other foot. The suicide rate for men in England and Wales is about 10 per 100,000 inhabitants, com- pared with 30 in France
I want to capture these instances:
regard- ed
com- pared
I tried r'\s[a-z].*-[a-z].*\s' to capture a word, then a dash and another word, but that isn't right.
I then tried r'\s[a-z].*-' on its own, and after that r'-\s[a-z].*\s', which captures:
- pared with 30 in France.
I could try:
left = re.search(r'\s[a-z].*-', text).group().rpartition(' ')[2]
right = re.search(r'-\s[a-z].*\s', text).group().partition(' ')[0]
left[:-1] + right[2:]
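But I'm sure there is a single-regex way to avoid all the partition mess. So how can I capture the desired instances with one regular expression? (Assume that a dash at the end of a word always indicates a desired instance, but that dashes padded with spaces on both sides, e.g. com - pared, are not wanted.)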
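Try storing the text in a string, say text, and then searching it with the re module.
My first attempt at a correct re pattern matched what I saw in the text and seems to work well: with that pattern, re.findall returns the hyphenated instances.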
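As a minimal sketch of a single-regex approach (the pattern and the names pattern, fragments, rejoined and fixed_text are my own assumptions, not necessarily what the answer above used): match a run of letters with a dash attached to its end, then whitespace, then another run of letters. Because the dash has to follow a letter directly, a space-padded dash such as com - pared is skipped, and because whitespace has to follow the dash, an ordinary hyphenated word is skipped too.

```python
import re

text = (
    "IN THE 18th century, suicide was regard- ed, particularly by the French, "
    "as an English disease. 'The English destroy themselves most unaccountably,' "
    "wrote Montesquieu, and Voltaire was told that during an East wind the "
    "English hanged themselves by the dozen. True or not, the chaussure is now "
    "on the other foot. The suicide rate for men in England and Wales is about "
    "10 per 100,000 inhabitants, com- pared with 30 in France"
)

# letters, a dash directly after them, whitespace, then more letters
pattern = re.compile(r"\b([A-Za-z]+)-\s+([A-Za-z]+)\b")

# The full matched fragments, exactly as they appear in the text
fragments = [m.group(0) for m in pattern.finditer(text)]
print(fragments)  # ['regard- ed', 'com- pared']

# With two groups, findall returns (left, right) tuples; glue them back together
rejoined = [left + right for left, right in pattern.findall(text)]
print(rejoined)  # ['regarded', 'compared']

# Or repair the text in place by dropping the dash and the whitespace
fixed_text = pattern.sub(r"\1\2", text)
```

Using [A-Za-z]+ instead of [a-z].* keeps each side of the match inside a single word, which is what stops the match from running to the last dash or the last space in the text.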