How do I parse a .txt file into an .xml file?



This is my txt file:

<pre class="lang-none prettyprint-override"><code>In File Name: C:\Users\naqushab\desktop\files\File 1.m1
Out File Name: C:\Users\naqushab\desktop\files\Output\File 1.m2
In File Size: Low: 22636 High: 0
Total Process time: 1.859000
Out File Size: Low: 77619 High: 0

In File Name: C:\Users\naqushab\desktop\files\File 2.m1
Out File Name: C:\Users\naqushab\desktop\files\Output\File 2.m2
In File Size: Low: 20673 High: 0
Total Process time: 3.094000
Out File Size: Low: 94485 High: 0

In File Name: C:\Users\naqushab\desktop\files\File 3.m1
Out File Name: C:\Users\naqushab\desktop\files\Output\File 3.m2
In File Size: Low: 66859 High: 0
Total Process time: 3.516000
Out File Size: Low: 217268 High: 0
</code></pre>

I am trying to parse it into the following XML format: [XML sample not preserved]

Here is the code I tried in order to achieve this (I am using Python 2):

<pre><code>import re
import xml.etree.ElementTree as ET

rex = re.compile(r'''(?P<title>In File Name: 
                       |Out File Name: 
                       |In File Size: Low: 
                       |Total Process time: 
                       |Out File Size: Low: )
                     (?P<value>.*)
                     ''', re.VERBOSE)

root = ET.Element('root')
root.text = '\n'    # newline before the celldata element

with open('Performance.txt') as f:
    celldata = ET.SubElement(root, 'filedata')
    celldata.text = '\n'    # newline before the collected element
    celldata.tail = '\n\n'  # empty line after the celldata element
    for line in f:
        # Empty line starts new celldata element (hack style, ugly)
        if line.isspace():
            celldata = ET.SubElement(root, 'filedata')
            celldata.text = '\n'
            celldata.tail = '\n\n'

        # If the line contains the wanted data, process it.
        m = rex.search(line)
        if m:
            # Fix some problems with the title as it will be used
            # as the tag name.
            title = m.group('title')
            title = title.replace('&', '')
            title = title.replace(' ', '')

            e = ET.SubElement(celldata, title.lower())
            e.text = m.group('value')
            e.tail = '\n'

# Display for debugging
ET.dump(root)

# Include the root element to the tree and write the tree
# to the file.
tree = ET.ElementTree(root)
tree.write('Performance.xml', encoding='utf-8', xml_declaration=True)
</code></pre>

But I am getting empty values. Is it possible to parse this txt file into XML?
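One likely contributor to the empty `<filedata>` elements is the `re.VERBOSE` flag. In verbose mode, unescaped whitespace inside the pattern is ignored, so the literal spaces in alternatives such as `In File Name: ` are dropped and the pattern no longer matches the input lines. The following is a minimal sketch of that behaviour, written for Python 3 and using a shortened two-alternative pattern; the variable names are illustrative and not taken from the question.

```python
import re

line = r"In File Name: C:\Users\naqushab\desktop\files\File 1.m1"

# Two of the question's alternatives, compiled with re.VERBOSE.
# Unescaped whitespace in the pattern is ignored, so this behaves like
# "(?P<title>InFileName:|OutFileName:)(?P<value>.*)".
verbose_rex = re.compile(r'''(?P<title>In File Name: |Out File Name: )
                             (?P<value>.*)''', re.VERBOSE)

# Escaping each space with a backslash keeps it significant in verbose mode.
escaped_rex = re.compile(r'''(?P<title>In\ File\ Name:\ |Out\ File\ Name:\ )
                             (?P<value>.*)''', re.VERBOSE)

verbose_match = verbose_rex.search(line)   # no match: spaces were stripped
escaped_match = escaped_rex.search(line)   # matches the title and the path

print(verbose_match)
print(escaped_match.group('title'), '|', escaped_match.group('value'))
```

Dropping `re.VERBOSE`, or escaping the spaces as above, should let `rex.search(line)` return a match for the lines of interest.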


1 Answer

Anonymous, 1 day ago

A correction to the regular expression. It should be

<pre><code>m = re.search('(?P<title>(In File Name)|(Out File Name)|(In File Size: *Low)|(Total Process time)|(Out File Size: *Low)):(?P<value>.*)',line)
</code></pre>

rather than what you have, because in a regex <code>In File Name|Out File Name</code> means it will check for <code>In File Nam</code>, then <code>e</code> or [...] followed by [...], and so on.

Suggestion

You can do this without a regular expression. xml.dom.minidom can be used to prettify the XML string. I have added comments for better understanding!

<blockquote> Node.toprettyxml([indent=""[, newl=""[, encoding=""]]]) Return a pretty-printed version of the document. indent specifies the indentation string and defaults to a tabulator; newl specifies the string emitted at the end of each line and defaults to <code>\n</code>. </blockquote>

Edit

<blockquote> <pre><code>import itertools as it
[line[0] for line in it.groupby(lines)]
</code></pre> You can use groupby from the itertools package to collapse consecutive duplicates in the list lines. </blockquote>

So: [code listing not preserved]

Output:

Performance.xml

<pre><code><?xml version="1.0" encoding="utf-8"?>
<root>
  <filedata>
    <InFileName>File 1.m1</InFileName>
    <OutFileName>File 1.m2</OutFileName>
    <InFileSize>22636</InFileSize>
    <TotalProcesstime>1.859000</TotalProcesstime>
    <OutFileSize>77619</OutFileSize>
  </filedata>
  <filedata>
    <InFileName>File 2.m1</InFileName>
    <OutFileName>File 2.m2</OutFileName>
    <InFileSize>20673</InFileSize>
    <TotalProcesstime>3.094000</TotalProcesstime>
    <OutFileSize>94485</OutFileSize>
  </filedata>
  <filedata>
    <InFileName>File 3.m1</InFileName>
    <OutFileName>File 3.m2</OutFileName>
    <InFileSize>66859</InFileSize>
    <TotalProcesstime>3.516000</TotalProcesstime>
    <OutFileSize>217268</OutFileSize>
  </filedata>
</root>
</code></pre>

Hope this helps!
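The answer's own script is not preserved here, so the following is only a rough sketch of the regex-free approach it describes, not the original code. It assumes Python 3, that every record begins with an `In File Name` line, and that only the file name and the `Low` size should be kept (as in the output shown). The helper names `parse_line`, `text_to_xml` and `pretty` are hypothetical, and `SAMPLE` is a two-record excerpt of the question's data.

```python
import xml.etree.ElementTree as ET
from xml.dom import minidom

# Two-record excerpt of the question's input (raw string keeps backslashes).
SAMPLE = r"""In File Name: C:\Users\naqushab\desktop\files\File 1.m1
Out File Name: C:\Users\naqushab\desktop\files\Output\File 1.m2
In File Size: Low: 22636 High: 0
Total Process time: 1.859000
Out File Size: Low: 77619 High: 0

In File Name: C:\Users\naqushab\desktop\files\File 2.m1
Out File Name: C:\Users\naqushab\desktop\files\Output\File 2.m2
In File Size: Low: 20673 High: 0
Total Process time: 3.094000
Out File Size: Low: 94485 High: 0
"""


def parse_line(line):
    """Return (tag, value) for a 'Key: value' line, or None if there is no colon."""
    key, sep, rest = line.partition(':')
    if not sep:
        return None
    tag = key.replace(' ', '')           # "In File Name" -> "InFileName"
    rest = rest.strip()
    if rest.startswith('Low:'):
        value = rest.split()[1]          # "Low: 22636 High: 0" -> "22636"
    elif '\\' in rest:
        value = rest.rsplit('\\', 1)[-1]  # keep only the file name of a path
    else:
        value = rest
    return tag, value


def text_to_xml(text):
    """Build <root><filedata>...</filedata>...</root> from the report text."""
    root = ET.Element('root')
    record = None
    for line in text.splitlines():
        parsed = parse_line(line.strip())
        if parsed is None:
            continue                     # blank or unrecognised line
        tag, value = parsed
        # Assumption: an "In File Name" line marks the start of a new record.
        if tag == 'InFileName' or record is None:
            record = ET.SubElement(root, 'filedata')
        ET.SubElement(record, tag).text = value
    return root


def pretty(root):
    """Serialise with minidom's toprettyxml for indented output."""
    rough = ET.tostring(root, encoding='utf-8')
    dom = minidom.parseString(rough)
    return dom.toprettyxml(indent='  ', encoding='utf-8').decode('utf-8')


print(pretty(text_to_xml(SAMPLE)))
```

To produce a file rather than printed text, the string returned by `pretty` could be written to `Performance.xml`; starting a new record on `In File Name` avoids relying on blank lines between records.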
