Parsing mixed XML/text in Python - Q&A - Python中文网

Posted 2024-09-26 17:36:00

I need to parse some log files in this ugly format (any number of plain-text headers, some of which carry additional data as XML):

[dd/mm/yy]:message_data
<starttag>
    <some_field>some_value</some_field>
     ....
</starttag>
[dd/mm/yy]:message_data
[dd/mm/yy]:message_data
....

My approach so far is:

[code snippet not preserved in the source]

where

import re

MESSAGE_START_RE = re.compile(r"<starttag.*>")
MESSAGE_END_RE = re.compile(r"</starttag>")
# header_info is a regex with named groups for the fields of the message

Do you know a better way?

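The problem with this approach is that I am parsing XML with regex (which is silly). Is there a package that can recognize where XML starts and ends inside a file?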

1 answer

User

#1 · Posted 2024-09-26 17:36:00

You can still use BeautifulSoup on ugly XML like this. Here is an example:

from bs4 import BeautifulSoup

data = """[dd/mm/yy]:message_data
<starttag>
    <some_field>some_value</some_field>
     ....
</starttag>
[dd/mm/yy]:message_data
[dd/mm/yy]:message_data"""

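# Name the parser explicitly; html.parser ships with Python and tolerates
# the plain-text header lines that surround the XML.
soup = BeautifulSoup(data, "html.parser")
for tag in soup.find_all("starttag"):
    print(tag.find("some_field").text)
    # => some_value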
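
If the header lines also need to be kept and matched with their XML data, a standard-library alternative is to scan the file line by line and hand each buffered block to xml.etree.ElementTree. The following is only a minimal sketch under several assumptions: each XML block opens on a line starting with <starttag and closes on a line ending with </starttag>, a block belongs to the header immediately before it, and headers look like [date]:message. The names HEADER_RE and parse_log are hypothetical and do not come from the question.

```python
import re
import xml.etree.ElementTree as ET

# Assumed header shape "[dd/mm/yy]:message"; adjust to the real log format.
HEADER_RE = re.compile(r"^\[(?P<date>[^\]]+)\]:(?P<message>.*)$")


def parse_log(text, root_tag="starttag"):
    """Return a list of [date, message, element_or_None] records."""
    records = []
    xml_lines = None  # holds buffered lines while inside an XML block
    open_prefix = "<" + root_tag
    close_tag = "</" + root_tag + ">"
    for line in text.splitlines():
        stripped = line.strip()
        if xml_lines is None:
            if stripped.startswith(open_prefix):
                xml_lines = []  # start buffering a new XML block
            else:
                match = HEADER_RE.match(stripped)
                if match:
                    records.append([match["date"], match["message"], None])
                continue  # header or unrecognized line: nothing to buffer
        xml_lines.append(line)
        if stripped.endswith(close_tag):
            # Block complete: parse it and attach it to the latest header.
            element = ET.fromstring("\n".join(xml_lines))
            if records:
                records[-1][2] = element
            xml_lines = None
    return records


sample = """[01/02/24]:first message
<starttag>
    <some_field>some_value</some_field>
</starttag>
[02/02/24]:second message
[03/02/24]:third message"""

for date, message, element in parse_log(sample):
    field = element.findtext("some_field") if element is not None else None
    print(date, message, field)
```

Regex is used here only to split header lines, while the XML itself goes through a real parser. Unlike BeautifulSoup, ET.fromstring raises xml.etree.ElementTree.ParseError on malformed XML, so wrap that call in a try/except if the logs may contain broken blocks.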
