How do I use a regular expression to replace strings in a list of strings?

Posted 2024-09-28 20:55:15


my_list = [
 '<instance id="line-nw8_059:8174:">',
 '  advanced micro devices inc sunnyvale calif and siemens ag of west germany '
 'said they agreed to jointly develop manufacture and market microchips for '
 'data communications and telecommunications with an emphasis on the '
 'integrated services digital network        the integrated services digital '
 'network or isdn is an international standard used to transmit voice data '
 'graphics and video images over telephone <head>line</head>   ',
 '<instance id="line-nw7_098:12684:">',
 '  in your may 21 story about the phone industry billing customers for '
 'unconnected calls i was surprised that you did not discuss whether such '
 'billing is appropriate    a caller who keeps a <head>line</head> open '
 'waiting for a connection uses communications switching and transmission '
 'equipment just as if a conversation were taking place  ',
 '<instance id="line-nw8_106:13309:">'
]

I have to replace every <instance id="line-nw8_106:13309:"> (any variation) with a blank and add them all to their own list. I've figured out how to add them to their own list using regex, like this:

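import re

instanceList = []
instanceMatch = r'<instance id="([^"]*)"'
for i in contentsTestSplit:
    # re.match only matches at the start of each string
    matchy = re.match(instanceMatch, i)
    if matchy:
        # group(0) is the whole matched text, not just the captured id
        instanceMatchy = matchy.group(0)
        instanceList.append(instanceMatchy)

print("instance list: ", instanceList)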
 

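This works, but I can't figure out how to replace them all with a blank. I tried doing it with the substitution method, but it doesn't work. Any help is appreciated: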

instanceList = []
instanceMatch = '<instance id="([^"]*)"'
pat = re.compile(r'<instance id="([^"]*)"')
for i in contentsTestSplit:
    matchy = re.match(instanceMatch, i)
    if matchy:
        instanceMatchy = matchy.group(0)
        instanceList.append(instanceMatchy)
        i = pat.sub("", i)

print("instance list: ", instanceList)

I also tried this. It finds the right entries, but it does not replace them:

for i in contentsTestSplit:
    if i.startswith("<instance id="):
        i.replace(i, "")

2 Answers

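You can use a regex substitution to replace all the instances with an empty string. re.sub accepts a function as the replacement, so you can pass it a custom function that appends each match to the instance list and returns the replacement text.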

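import re

instanceList = []
instanceMatch = r'<instance id="([^"]*)"'

def _sub(match):
    # Record the matched text, then replace it with an empty string
    instanceList.append(match[0])
    return ''

for i in my_list:
    # The return value (the text with the match removed) is discarded here
    re.sub(instanceMatch, _sub, i)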

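I don't know what you want to do with the processed data, but re.sub(instanceMatch, _sub, i) returns the text with the replacements applied. Strings are immutable, so you have to store that return value if you want to keep the result.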
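To make the point about the return value concrete, here is a minimal sketch that keeps the processed text in a new list. The names instance_pattern, _collect and cleaned are made up for illustration, the sample data is a shortened version of the list in the question, and the pattern assumes the closing ">" should be removed along with the rest of the tag (the pattern in the answer above stops at the closing quote).

```python
import re

# Shortened sample data, shaped like the list in the question.
my_list = [
    '<instance id="line-nw8_059:8174:">',
    '  images over telephone <head>line</head>   ',
    '<instance id="line-nw7_098:12684:">',
    '  a caller who keeps a <head>line</head> open ',
    '<instance id="line-nw8_106:13309:">',
]

# Assumption: the trailing ">" belongs to the tag and should be removed too.
instance_pattern = re.compile(r'<instance id="[^"]*">')

instance_list = []

def _collect(match):
    # Record the full tag, then replace it with an empty string.
    instance_list.append(match[0])
    return ''

# Strings are immutable, so build a new list from the values re.sub returns.
cleaned = [instance_pattern.sub(_collect, text) for text in my_list]

print("instance list:", instance_list)
print("cleaned:", cleaned)
```

If the emptied entries are not wanted at all, one option is to filter them out afterwards, for example with [text for text in cleaned if text].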

I'm wondering why id="([^"])" gives a string, whereas id="[^"]" gives nothing. What is the purpose of the ()?
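On the question about the parentheses: in Python's re module, () creates a capturing group, which lets you pull out just that part of the match. The sketch below compares the pattern from the question with and without the group. The variable names are illustrative only, and this may not be exactly the situation the commenter ran into.

```python
import re

tag = '<instance id="line-nw8_106:13309:">'

with_group = re.match(r'<instance id="([^"]*)"', tag)
without_group = re.match(r'<instance id="[^"]*"', tag)

# group(0) is always the whole match, with or without parentheses.
whole_with = with_group.group(0)
whole_without = without_group.group(0)

# group(1) exists only when the pattern contains a capturing group.
id_only = with_group.group(1)
captured_without = without_group.groups()  # empty tuple: nothing captured

# re.findall returns the captured group if there is one, else the whole match.
found_with = re.findall(r'id="([^"]*)"', tag)
found_without = re.findall(r'id="[^"]*"', tag)

print(whole_with, id_only, captured_without)
print(found_with, found_without)
```

So both patterns match the same text. The parentheses only change what you can extract from the match afterwards.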
