Regex to replace the sub-pattern "hello1,2,3" with "hello" without affecting other numeric patterns

Posted 2024-09-28 21:06:49


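Sorry, I have no experience with regular expressions.

I want to remove every sub-pattern in a string where a word is immediately followed by a number ('hello1') or by a series of numbers ('hello1,2,3'), replacing the whole pattern with the original word ('hello').

Here is my text: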

x='interspersed by 1,6-hexanediol, corresponds to the immobile component observed with FRAP. Consistent with this idea, the immobile fraction of HP 1a.... own to arise through phase separation23. N&B analysis of GFP–fibrillarin highlighted areas of consistently high variance (2.38 ± 0.46 -mers) at the nucleolar boundary, compared to inside (1.28 ± 0.36) or outside (1.17 ± 0.25) the domain24, 25, 26. Similarly, GFP–HP 1a displayed increased variance34, 37'

x = re.sub(r'([^ 0-9])(\d+(?:, \d+)*)', r'\1', x)

Above, I use a regex to remove digits that directly follow a word, but it also has unwanted side effects:

interspersed by 1,-hexanediol, corresponds to the immobile component observed with FRAP. Consistent with this idea, the immobile fraction of HP 1a.... own to arise through phase separation. N&B analysis of GFP\xe2\x80\x93fibrillarin highlighted areas of consistently high variance (.\xe2\x80\x89\xc2\xb1\xe2\x80\x89. -mers) at the nucleolar boundary, compared to inside (.\xe2\x80\x89\xc2\xb1\xe2\x80\x89.) or outside (.\xe2\x80\x89\xc2\xb1\xe2\x80\x89.) the domain. Similarly, GFP\xe2\x80\x93HP 1a displayed increased variance

The expected output is:

x='interspersed by 1,6-hexanediol, corresponds to the immobile component observed with FRAP. Consistent with this idea, the immobile fraction of HP 1a.... own to arise through phase separation. N&B analysis of GFP–fibrillarin highlighted areas of consistently high variance (2.38 ± 0.46 -mers) at the nucleolar boundary, compared to inside (1.28 ± 0.36) or outside (1.17 ± 0.25) the domain. Similarly, GFP–HP 1a displayed increased variance'

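Patterns such as "1,6-hexanediol", "1.28 ± 0.36" and "HP 1a" should be kept, with their digits intact.

Update:

This expression does not seem to fully remove patterns that contain a hyphen or dash (e.g. word11-12).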

x='than allelic variant and define eQTLs9–11'

x = re.sub(r"(?<=\w)\d+(?:, \d+)*", "", x)

Result:

than allelic variant and define eQTLs–1

Expected output:

than allelic variant and define eQTLs

Can anyone help me achieve this?


1 answer
User
Answer #1 · Posted 2024-09-28 21:06:49
x = re.sub(r"(?<=\w)\d+(?:, \d+)*", "", x)

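Explanation

(?<=...) is a positive lookbehind. It essentially says "make sure this comes right before the current position, without including it in the match". If you prefer, you can replace it with an ordinary capturing group and use \1 in the replacement.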
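As a minimal sketch of that equivalence, the snippet below runs both forms on a short sample sentence (the sample text is made up from fragments of the question, not the full input). On this sample the two approaches give the same result; the difference is only whether the preceding character is consumed and then restored.

```python
import re

text = "phase separation23 and the domain24, 25, 26."

# Lookbehind: the preceding word character is checked but not consumed,
# so the replacement can simply be the empty string.
with_lookbehind = re.sub(r"(?<=\w)\d+(?:, \d+)*", "", text)

# Capturing group: the preceding character is consumed by the match,
# so it has to be put back with \1 in the replacement.
with_group = re.sub(r"(\w)\d+(?:, \d+)*", r"\1", text)

print(with_lookbehind)
print(with_group)
```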

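\w matches a "word" character. For ASCII text this is equivalent to [a-zA-Z0-9_], so it includes digits (in Python 3 str patterns it also matches Unicode letters and digits).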
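Because \w also matches digits in Python, the lookbehind (?<=\w) is satisfied by a digit that follows another digit. That is why the update in the question ends up with "eQTLs–1", and it can also trim decimals such as "2.38". The sketch below illustrates this on a made-up sample and compares it with a letters-only lookbehind, (?<=[A-Za-z]), which is an assumption about what counts as a "word" here and ignores non-ASCII letters.

```python
import re

# A digit is a word character as far as \w is concerned.
digit_is_word_char = bool(re.fullmatch(r"\w", "7"))

sample = "variance (2.38 ± 0.46 -mers) in eQTLs9–11"

# (?<=\w): the "8" in "2.38" is preceded by "3", which \w accepts.
with_w = re.sub(r"(?<=\w)\d+(?:, \d+)*", "", sample)

# (?<=[A-Za-z]): only digits glued to an ASCII letter are removed.
with_letters = re.sub(r"(?<=[A-Za-z])\d+(?:, \d+)*", "", sample)

print(digit_is_word_char)
print(with_w)
print(with_letters)
```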

\d+ matches one or more digits.

(?:, \d+)* matches a comma, followed by a space, followed by one or more digits, zero or more times.
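For the ranges mentioned in the update (eQTLs9–11, word11-12), one possible extension is sketched below. It is not part of the original answer: the helper name strip_citation_numbers is hypothetical, and the pattern assumes that citation numbers always directly follow an ASCII letter and that ranges use either a hyphen or an en dash. A legitimate number that follows a citation after ", " would also be removed, so treat this as a starting point rather than a complete solution.

```python
import re

# One number, optionally extended to a range with "-" or "–" (en dash).
NUMBER_OR_RANGE = r"\d+(?:[-–]\d+)?"

# Digits glued to a letter, followed by any further ", n" or ", n-m" items.
CITATION = re.compile(
    r"(?<=[A-Za-z])" + NUMBER_OR_RANGE + r"(?:, ?" + NUMBER_OR_RANGE + r")*"
)

def strip_citation_numbers(text: str) -> str:
    """Remove citation-style numbers that directly follow a letter."""
    return CITATION.sub("", text)

text = (
    "interspersed by 1,6-hexanediol, phase separation23. variance "
    "(2.38 ± 0.46 -mers) outside the domain24, 25, 26. GFP–HP 1a "
    "and define eQTLs9–11 or word11-12"
)
cleaned = strip_citation_numbers(text)
print(cleaned)
```

Numbers that follow a space or punctuation, such as "1,6-hexanediol", "2.38" and "HP 1a", are left alone because the lookbehind only accepts a letter.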
