Removing a specific part of a variable in Python

User

Answer 1 · edited 2024-05-19 15:39:53

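It looks like you are trying to do pattern-based text manipulation, which is exactly what regular expressions are good at. It is hard to generalize from a single example: the more precisely you describe the transformation, the easier it is to write a regular expression that does what you need. The Python documentation on regular expressions is a useful reference: https://docs.python.org/3/library/re.html

If I had to generalize a pattern from your example and description, I would write the following regular expression: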

import re

myre = re.compile(
    r'([A-Za-z]+_[\d]+)' # This will match "scaffold_356" and put it in the first group
    r'_[\d]+-[\d]+_\+_' # This will match "_1-1000_+_" ungrouped
    r'(_[A-Za-z]{3})' # This will match "_Gen" and put it in the second group
    r'[A-Za-z]*' # This will match any additional letters ("us"), ungrouped
    r'(_[A-Za-z]{3})' # This will match "_spe" and put it in the third group
)

If you try this regular expression, you can see that it extracts the parts you need to build the final result:

matches = myre.match('scaffold_356_1-1000_+__Genus_species')
# A match object is not iterable, so join the captured groups instead
print(''.join(matches.groups())) # prints scaffold_356_Gen_spe

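Of course, this regular expression only works for a very specific pattern, and it is unforgiving of any input that does not follow that pattern exactly.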
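To show what "unforgiving" means in practice, here is a minimal sketch. The helper name `shorten` and the second test string are made up for illustration and are not part of the original answer. When the input deviates from the pattern, `match` returns `None`, so it is worth checking for that before calling `groups()`.

```python
import re

# Same pattern as above, compiled once.
PATTERN = re.compile(
    r'([A-Za-z]+_\d+)'   # e.g. "scaffold_356"
    r'_\d+-\d+_\+_'      # e.g. "_1-1000_+_"
    r'(_[A-Za-z]{3})'    # e.g. "_Gen"
    r'[A-Za-z]*'         # rest of the first word
    r'(_[A-Za-z]{3})'    # e.g. "_spe"
)

def shorten(name):
    """Return the shortened name, or None if the input does not fit the pattern.

    Hypothetical helper, for illustration only.
    """
    m = PATTERN.match(name)
    if m is None:
        return None
    return ''.join(m.groups())

print(shorten('scaffold_356_1-1000_+__Genus_species'))  # scaffold_356_Gen_spe
print(shorten('scaffold_356_1-1000_-__Genus_species'))  # None ("-" instead of "+")
```

Returning `None` is only one option; raising an exception would work just as well if a malformed name should stop the program.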

User

Answer 2 · edited 2024-05-19 15:39:53

You are almost there. I would split the string into a prefix and a suffix, modify each one separately, and then join them back together.

import re
s = 'scaffold_356_1-1000_+__Genus_species'

# Split into prefix and suffix at the double underscore
prefix, suffix = s.split('__')
# prefix: scaffold_356_1-1000_+, suffix: Genus_species

# Keep the first three characters of each word in the suffix
modified_suffix = '_'.join([part[0:3] for part in suffix.split('_')])
# Gen_spe

# Remove the digit range with a regex, then strip the trailing underscores and plus sign
modified_prefix = re.sub(r'\d+-\d*', '', prefix).rstrip('_+')
# scaffold_356

# Join the strings back together
final_s = modified_prefix + '_' + modified_suffix
print(final_s)
# scaffold_356_Gen_spe
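If you need this for many strings, the same split-and-rejoin idea can be wrapped in a function. The sketch below is a regex-free variant rather than the exact code above: the name `shorten_by_split` is hypothetical, and it assumes the part you want to keep before the `__` is always the first two underscore-separated fields.

```python
def shorten_by_split(s):
    """Shorten 'scaffold_356_1-1000_+__Genus_species' to 'scaffold_356_Gen_spe'.

    Hypothetical helper; assumes the identifier is the first two
    underscore-separated fields before the double underscore.
    """
    # partition returns (head, separator, tail); separator is '' if '__' is absent
    head, sep, tail = s.partition('__')
    if not sep:
        raise ValueError("expected '__' in " + repr(s))

    # Keep only the first two fields of the head, e.g. "scaffold_356"
    scaffold = '_'.join(head.split('_')[:2])

    # Keep the first three characters of each word in the tail, e.g. "Gen_spe"
    species = '_'.join(word[:3] for word in tail.split('_'))

    return scaffold + '_' + species

print(shorten_by_split('scaffold_356_1-1000_+__Genus_species'))
# scaffold_356_Gen_spe
```

Using `partition` instead of `split` avoids an unpacking error when the separator is missing, so the function can report the problem with a clearer message.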

User

Answer 3 · edited 2024-05-19 15:39:53

This may not be the most elegant solution, but it works as long as your input always follows the pattern string_3digits_1digit-4digits_+__string_string.

import re

a_string = 'scaffold_356_1-1000_+__Genus_species'

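# Raw string so that \+ is passed to the regex engine unchanged
new = re.findall(r'^([a-zA-Z]+_[0-9][0-9][0-9]_).+?_\+__([a-zA-Z][a-zA-Z][a-zA-Z]).*(_[a-zA-Z][a-zA-Z][a-zA-Z]).*', a_string)

# findall returns a list of tuples, one tuple of captured groups per match
print(''.join(new[0]))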
# scaffold_356_Gen_spe

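This example uses a regex pattern with capture groups. You may want to experiment with the regex to understand how the pattern is put together. If you paste this pattern into regex101, it will give you a readable explanation of each piece.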
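Another way to see the structure is to spell the pattern out with `re.VERBOSE` and named groups. This is a sketch of a slightly different pattern from the one above, not a drop-in copy: the group names `scaffold`, `genus`, and `species` are my own labels, and it assumes the two words after `__` contain only letters.

```python
import re

# re.VERBOSE ignores whitespace in the pattern and allows inline comments.
PATTERN = re.compile(r"""
    ^(?P<scaffold>[a-zA-Z]+_[0-9]{3})   # name plus a three-digit ID, e.g. scaffold_356
    _.+?_\+__                           # range and strand part, e.g. _1-1000_+__
    (?P<genus>[a-zA-Z]{3})              # first three letters of the first word
    [a-zA-Z]*_                          # rest of the first word and the separator
    (?P<species>[a-zA-Z]{3})            # first three letters of the second word
""", re.VERBOSE)

a_string = 'scaffold_356_1-1000_+__Genus_species'
m = PATTERN.match(a_string)

# group() with several names returns a tuple of the captured strings
result = '_'.join(m.group('scaffold', 'genus', 'species'))
print(result)
# scaffold_356_Gen_spe
```

Named groups make it easier to change the output format later, because each piece can be picked out by name instead of by position.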
