<p>First, your XML is missing the closing tag for <code><TextLine custom="readingOrder {index:1;}" id="Ad0010100l2"></code>. Once you insert it in the right place, the following lxml snippet should do what you need:</p>
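<pre><code>from lxml import etree

my_xml = """[your xml above, corrected]"""
data = etree.XML(my_xml.encode('ascii'))
# Select every Unicode element (in any namespace) that has no text node
for target in data.xpath("//*[local-name() = 'Unicode'][not(text())]"):
    # lxml elements know their parent, so the element can be detached directly
    target.getparent().remove(target)
print(etree.tostring(data, xml_declaration=True).decode())
</code></pre>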
<p>Output:</p>
<pre><code><?xml version='1.0' encoding='ASCII'?>
<PcGts
xmlns="http://schema.primaresearch.org/PAGE/gts/pagecontent/2013-07-15"
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:schemaLocation="http://schema.primaresearch.org/PAGE/gts/pagecontent/2013-07-15 http://schema.primaresearch.org/PAGE/gts/pagecontent/2013-07-15/pagecontent.xsd">
<Page imageFilename="1.png">
<TextRegion custom="a">
<TextLine custom="readingOrder {index:0;}" id="Ar0010001l1">
<TextEquiv>
<Unicode> abc </Unicode>
</TextEquiv>
</TextLine>
<TextLine custom="readingOrder {index:1;}" id="Ad0010100l2">
<TextEquiv/>
</TextLine>
</TextRegion>
</Page>
</PcGts>
</code></pre>
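<p>If lxml is not available, roughly the same result can be sketched with the standard library's <code>xml.etree.ElementTree</code>. This is a different technique from the XPath above: ElementTree has no <code>getparent()</code> and does not support <code>local-name()</code>, so the sketch walks every element as a potential parent and matches the fully qualified tag instead. The <code>SAMPLE</code> document, the shortened <code>id</code> values, and the helper name <code>remove_empty_unicode</code> are assumptions made for illustration, not part of your file.</p>

```python
import xml.etree.ElementTree as ET

NS = "http://schema.primaresearch.org/PAGE/gts/pagecontent/2013-07-15"

# Hypothetical, shortened sample document (not the asker's full file).
SAMPLE = f"""<PcGts xmlns="{NS}">
  <Page imageFilename="1.png">
    <TextRegion custom="a">
      <TextLine id="l1"><TextEquiv><Unicode> abc </Unicode></TextEquiv></TextLine>
      <TextLine id="l2"><TextEquiv><Unicode></Unicode></TextEquiv></TextLine>
    </TextRegion>
  </Page>
</PcGts>"""


def remove_empty_unicode(root):
    """Remove namespaced Unicode children that have no text; return the count."""
    tag = f"{{{NS}}}Unicode"
    removed = 0
    # ElementTree has no getparent(), so treat every element as a possible parent.
    # Snapshot the traversal first so removals cannot disturb it.
    for parent in list(root.iter()):
        # Copy the child list because we remove from it while looping.
        for child in list(parent):
            if child.tag == tag and not child.text:
                parent.remove(child)
                removed += 1
    return removed


root = ET.fromstring(SAMPLE)
removed_count = remove_empty_unicode(root)
remaining = [el.text for el in root.iter(f"{{{NS}}}Unicode")]
print(removed_count, remaining)
```

<p>As in the lxml version, a <code>Unicode</code> element containing only whitespace counts as having text and is kept. The namespace is hard-coded here, so this variant only matches elements from that one PAGE schema version.</p>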