lxml's parser target does not fire the "start" callback immediately when a start tag is fed

from lxml import etree

class EchoTarget(object):
    def start(self, tag, attrib):
        print("start %s %s" % (tag, attrib))
    def end(self, tag):
        print("end %s" % tag)
    def data(self, data):
        print("data %r" % data)
    def comment(self, text):
        print("comment %s" % text)
    def close(self):
        print("close")
        return "closed!"

>>> p = etree.XMLParser(target=EchoTarget())
>>> p.feed('<a>')   # nothing happens
>>> p.feed(' ')     # suddenly..
start a {}
>>> p.feed('<b>')   # works as expected
data u' '
start b {}

1 Answer

User

#1 · Posted on 2024-10-04 05:28:37

From reading the documentation, this appears to be the expected behavior (quoted from http://lxml.de/parsing.html#the-feed-parser-interface):

"If you do not call close(), the parser will stay locked and subsequent feeds will keep appending data, usually resulting in a non well-formed document and an unexpected parser error. So make sure you always close the parser after use, also in the exception case."

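So the parser is "waiting" either for more input to be fed or to be closed. You can check whether what you have fed so far is valid XML by calling the close method: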

>>> p.feed('<a>')
>>> p.close()
start a {}
close
Traceback (most recent call last):
  File "<input>", line 1, in <module>
  File "parser.pxi", line 1171, in lxml.etree._FeedParser.close (src/lxml/lxml.etree.c:79791)
  File "parsertarget.pxi", line 128, in lxml.etree._TargetParserContext._handleParseResult (src/lxml/lxml.etree.c:88895)
  File "parser.pxi", line 590, in lxml.etree._raiseParseError (src/lxml/lxml.etree.c:74696)
XMLSyntaxError: Extra content at the end of the document, line 1, column 4
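
The quoted advice to close the parser "also in the exception case" could be followed with a small wrapper along these lines. This is only a sketch: `feed_all` and `CountingTarget` are hypothetical names that do not appear in the question or in lxml, and the fallback to the standard library's `xml.etree.ElementTree` (which has a compatible target interface, though the timing of its callbacks may differ) is an assumption added so the snippet also runs where lxml is not installed.

```python
try:
    from lxml import etree  # the library discussed in this thread
except ImportError:
    # Assumption: fall back to the stdlib parser, whose target interface
    # (start/end/data/close, feed/close) is compatible with lxml's.
    import xml.etree.ElementTree as etree


class CountingTarget(object):
    """Hypothetical target that counts the start tags it sees."""

    def __init__(self):
        self.count = 0

    def start(self, tag, attrib):
        self.count += 1

    def end(self, tag):
        pass

    def data(self, data):
        pass

    def close(self):
        # The return value of the target's close() is what parser.close() returns.
        return self.count


def feed_all(parser, chunks):
    """Feed every chunk, and close the parser even if feeding fails."""
    try:
        for chunk in chunks:
            parser.feed(chunk)
    except Exception:
        # Still close the parser so it does not stay locked; a second
        # error raised by close() itself is ignored in favour of the first.
        try:
            parser.close()
        except Exception:
            pass
        raise
    # On the normal path, close() raises if the document is not well-formed.
    return parser.close()
```

Calling `feed_all(etree.XMLParser(target=CountingTarget()), ['<a>', '<b/>', '</a>'])` returns whatever the target's `close()` returns, while malformed input surfaces as a parse error either from `feed()` or from `close()`.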

By contrast, if you also feed the closing tag, so that the input is well-formed XML, close() completes without raising an error.
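
A self-contained way to check the well-formed case might look like the following sketch. `RecordingTarget` is a hypothetical variant of the question's `EchoTarget` that stores events in a list instead of printing them, so only the final sequence is inspected and not the moment each callback fires. The fallback import to the standard library is an assumption, added only so the snippet runs without lxml installed.

```python
try:
    from lxml import etree  # the library discussed in this thread
except ImportError:
    # Assumption: the stdlib parser accepts the same kind of target object.
    import xml.etree.ElementTree as etree


class RecordingTarget(object):
    """Hypothetical target that records callbacks instead of printing them."""

    def __init__(self):
        self.events = []

    def start(self, tag, attrib):
        self.events.append(("start", tag, dict(attrib)))

    def end(self, tag):
        self.events.append(("end", tag))

    def data(self, data):
        self.events.append(("data", data))

    def close(self):
        self.events.append(("close",))
        return "closed!"


target = RecordingTarget()
parser = etree.XMLParser(target=target)
parser.feed('<a>')     # opening tag only: callbacks may be deferred
parser.feed('</a>')    # the document is now well-formed
result = parser.close()  # flushes pending events, then calls target.close()

print(target.events)
print(result)
```

Because the events are only examined after `close()`, the check does not depend on how much input the parser buffers before firing a callback.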

Hope this helps.
