<p>A correction to your regular expression: it should be</p>
<pre><code>m = re.search('(?P<title>(In File Name)|(Out File Name)|(In File Size: *Low)|(Total Process time)|(Out File Size: *Low)):(?P<value>.*)',line)
</code></pre>
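<p>rather than the one you gave. In a regex, <code>|</code> binds more loosely than any other operator, so an alternation such as <code>In File Name|Out File Name</code> splits everything around it, up to the enclosing group, into whole alternatives. Without the grouping parentheses, a trailing <code>:(?P&lt;value&gt;.*)</code> would attach only to the last alternative rather than to whichever title matched.</p>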
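<p>To see how the grouping affects the match, here is a minimal sketch. The sample lines are made up for illustration (the real log format may differ), and the <code>ungrouped</code> pattern is a hypothetical counter-example, not the pattern from the question.</p>

```python
import re

# Pattern from the answer: every title is an alternative inside the named group,
# so ':(?P<value>.*)' applies to whichever title matched.
PATTERN = re.compile(
    r'(?P<title>(In File Name)|(Out File Name)|(In File Size: *Low)'
    r'|(Total Process time)|(Out File Size: *Low)):(?P<value>.*)'
)

# Hypothetical log lines, invented for this example.
sample_lines = [
    "In File Name:File 1.m1",
    "Out File Name:File 1.m2",
    "Total Process time:1.859000",
]

parsed = []
for line in sample_lines:
    m = PATTERN.search(line)
    if m:
        parsed.append((m.group('title'), m.group('value').strip()))

# Hypothetical ungrouped pattern: ':(?P<value>.*)' belongs only to the
# second alternative, so the first alternative matches without a value.
ungrouped = re.compile(r'In File Name|Out File Name:(?P<value>.*)')
m_ungrouped = ungrouped.search("In File Name:File 1.m1")
```

<p>With the grouped pattern every title yields a value; with the ungrouped one, the first alternative matches on its own and <code>value</code> stays unset.</p>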
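<p>Suggestion</p>
<p>You can do this without regular expressions.
<strong>xml.dom.minidom</strong> can be used to pretty-print the XML string.</p>
<p>I have added comments for better understanding!</p>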
<blockquote>
<p><strong>Node.toprettyxml([indent=""[, newl=""[, encoding=""]]])</strong></p>
<p>Return a pretty-printed version of the document. indent specifies the indentation string and defaults to a tabulator; newl specifies the string emitted at the end of each line and defaults to <code>\n</code>.</p>
</blockquote>
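<p>As a rough illustration of how <code>toprettyxml</code> might be used here, the sketch below builds a small document with <code>xml.dom.minidom</code> and pretty-prints it. The <code>records</code> list and its values are assumptions made for this example; the tag names simply mirror the titles with the spaces removed.</p>

```python
from xml.dom import minidom

# Hypothetical parsed records: one list of (tag, value) pairs per file.
records = [
    [("InFileName", "File 1.m1"), ("OutFileName", "File 1.m2")],
    [("InFileName", "File 2.m1"), ("OutFileName", "File 2.m2")],
]

doc = minidom.Document()
root = doc.createElement("root")
doc.appendChild(root)

for record in records:
    # One <filedata> element per record.
    filedata = doc.createElement("filedata")
    root.appendChild(filedata)
    for tag, value in record:
        element = doc.createElement(tag)
        element.appendChild(doc.createTextNode(value))
        filedata.appendChild(element)

# Passing an encoding makes toprettyxml return bytes in Python 3
# and adds the encoding to the XML declaration.
pretty = doc.toprettyxml(indent="  ", newl="\n", encoding="utf-8")
```

<p>The resulting <code>pretty</code> value can be written to a file opened in binary mode.</p>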
<p><strong>Edit</strong></p>
<blockquote>
<pre><code>import itertools as it
[line[0] for line in it.groupby(lines)]
</code></pre>
<p>You can use <code>groupby</code> from the <code>itertools</code> package to collapse consecutive duplicates in the list <code>lines</code>.</p>
</blockquote>
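<p>A small sketch of what that snippet does, using made-up input lines: <code>groupby</code> yields one <code>(key, group)</code> pair per run of equal items, so keeping only the keys drops consecutive repeats while leaving non-adjacent duplicates in place.</p>

```python
import itertools as it

# Hypothetical input with repeated consecutive lines.
lines = [
    "In File Name:File 1.m1",
    "In File Name:File 1.m1",
    "Out File Name:File 1.m2",
    "In File Name:File 1.m1",
]

# Keep the key of each run; the group iterator itself is not needed.
deduped = [key for key, _group in it.groupby(lines)]
```

<p>Here the first two identical lines collapse into one, but the last line is kept because it is not adjacent to them.</p>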
<p>So</p>
^{pr2}$
<p>Output:
<strong>performance.xml</strong></p>
<pre><code><?xml version="1.0" encoding="utf-8"?>
<root>
<filedata>
<InFileName>File 1.m1</InFileName>
<OutFileName>File 1.m2</OutFileName>
<InFileSize>22636</InFileSize>
<TotalProcesstime>1.859000</TotalProcesstime>
<OutFileSize>77619</OutFileSize>
</filedata>
<filedata>
<InFileName>File 2.m1</InFileName>
<OutFileName>File 2.m2</OutFileName>
<InFileSize>20673</InFileSize>
<TotalProcesstime>3.094000</TotalProcesstime>
<OutFileSize>94485</OutFileSize>
</filedata>
<filedata>
<InFileName>File 3.m1</InFileName>
<OutFileName>File 3.m2</OutFileName>
<InFileSize>66859</InFileSize>
<TotalProcesstime>3.516000</TotalProcesstime>
<OutFileSize>217268</OutFileSize>
</filedata>
</root>
</code></pre>
<p>Hope this helps!</p>