<pre><code>contentsTestSplit = [
'<instance id="line-nw8_059:8174:">',
' advanced micro devices inc sunnyvale calif and siemens ag of west germany '
'said they agreed to jointly develop manufacture and market microchips for '
'data communications and telecommunications with an emphasis on the '
'integrated services digital network the integrated services digital '
'network or isdn is an international standard used to transmit voice data '
'graphics and video images over telephone <head>line</head> ',
'<instance id="line-nw7_098:12684:">',
' in your may 21 story about the phone industry billing customers for '
'unconnected calls i was surprised that you did not discuss whether such '
'billing is appropriate a caller who keeps a <head>line</head> open '
'waiting for a connection uses communications switching and transmission '
'equipment just as if a conversation were taking place ',
'<instance id="line-nw8_106:13309:">'
]
</code></pre>
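<p>I need to replace every <code><instance id="line-nw8_106:13309:"></code> element (in any variation) with a blank string, and also collect all of them in a list of their own. I have worked out how to collect them in their own list using a regex, as follows:</p>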
<pre><code>import re

instanceList = []
instanceMatch = '<instance id="([^"]*)"'
for i in contentsTestSplit:
    matchy = re.match(instanceMatch, i)
    if matchy:
        # group(0) is the whole matched text, e.g. '<instance id="line-nw8_059:8174:"'
        instanceMatchy = matchy.group(0)
        instanceList.append(instanceMatchy)
print("instance list: ", instanceList)
</code></pre>
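<p>This works, but I can't figure out how to replace all of those elements with a blank string. I tried doing it with a regex substitution, but it doesn't work. Any help would be appreciated:</p>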
<pre><code>instanceList = []
instanceMatch = '<instance id="([^"]*)"'
pat = re.compile(r'<instance id="([^"]*)"')
for i in contentsTestSplit:
    matchy = re.match(instanceMatch, i)
    if matchy:
        instanceMatchy = matchy.group(0)
        instanceList.append(instanceMatchy)
    i = pat.sub("", i)
print("instance list: ", instanceList)
</code></pre>
<p>I also tried the following. It doesn't replace anything, although it does locate the matching elements correctly:</p>
<pre><code>for i in contentsTestSplit:
    if i.startswith("<instance id="):
        i.replace(i, "")
</code></pre>
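<p>For reference, here is a minimal sketch of one way the intended result might be obtained. It is an illustration, not a definitive answer. Python strings are immutable, so <code>str.replace</code> and <code>Pattern.sub</code> return a new string instead of changing the original, and assigning to the loop variable <code>i</code> only rebinds that name without touching the list. The sketch therefore builds a new list. The names <code>cleaned</code> and <code>item</code> are assumptions made for the example, the sample data is an abbreviated excerpt of the list above, and the pattern is extended with a closing <code>></code> on the assumption that the whole tag should be removed.</p>

```python
import re

# Abbreviated excerpt of the list from the question (sample data for illustration).
contentsTestSplit = [
    '<instance id="line-nw8_059:8174:">',
    ' graphics and video images over telephone <head>line</head> ',
    '<instance id="line-nw7_098:12684:">',
    ' a caller who keeps a <head>line</head> open ',
    '<instance id="line-nw8_106:13309:">',
]

# Assumption: the pattern includes the closing '>' so the entire tag is matched.
pat = re.compile(r'<instance id="([^"]*)">')

instanceList = []  # every <instance ...> tag that was found
cleaned = []       # hypothetical name: copy of the input with the tags blanked out

for item in contentsTestSplit:
    matchy = pat.match(item)
    if matchy:
        instanceList.append(matchy.group(0))
    # sub() returns a new string; store it instead of rebinding the loop variable.
    cleaned.append(pat.sub("", item))

print("instance list: ", instanceList)
print("cleaned: ", cleaned)
```

<p>If the original list has to be updated in place, looping with <code>enumerate</code> and assigning to <code>contentsTestSplit[index]</code> would be one alternative to building a second list.</p>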