Getting a URL from an XML tag

<xml xmlns="http://www.myweb.org/2003/instance"
     xmlns:link="http://www.myweb.org/2003/linkbase"
     xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
     xmlns:xlink="http://www.w3.org/1999/xlink"
     xmlns:iso4217="http://www.myweb.org/2003/iso4217"
     xmlns:utr="http://www.myweb.org/2009/utr">
  <link:schemaRef xlink:type="simple" xlink:href="http://www.myweb.com/form/2020-01-01/test.xsd"></link:schemaRef>

from lxml import etree

with open(filepath, 'rb') as f:
    file = f.read()

root = etree.XML(file)
print(root.nsmap["link"])  # http://www.myweb.org/2003/linkbase
print(root.find(".//{" + root.nsmap["link"] + "}" + "schemaRef"))

2 answers

User

#1 · Edited 2024-10-01 15:43:52

Try this and see whether it works:

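import lxml.etree
for i in root.xpath('//*/node()'):
    if isinstance(i, lxml.etree._Element):  # node() also yields text nodes; skip them
        print(i.values()[1])  # second attribute value, which is xlink:href in this document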

Output:

http://www.myweb.com/form/2020-01-01/test.xsd
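
Indexing into values() depends on the order of the attributes, so a variant that names the attribute explicitly may be more robust. The following is only a sketch: SAMPLE is a trimmed stand-in for the XML in the question (fewer namespace declarations, closing tag added so that it parses), and the variable names are illustrative.

```python
from lxml import etree

# Trimmed stand-in for the question's XML; the closing </xml> tag is an assumption.
SAMPLE = b"""<xml xmlns="http://www.myweb.org/2003/instance"
     xmlns:link="http://www.myweb.org/2003/linkbase"
     xmlns:xlink="http://www.w3.org/1999/xlink">
  <link:schemaRef xlink:type="simple"
                  xlink:href="http://www.myweb.com/form/2020-01-01/test.xsd"/>
</xml>"""

root = etree.fromstring(SAMPLE)

# XPath has no notion of a default namespace, so drop the None prefix from nsmap.
ns = {prefix: uri for prefix, uri in root.nsmap.items() if prefix is not None}

# Select the xlink:href attribute of every link:schemaRef element.
hrefs = root.xpath("//link:schemaRef/@xlink:href", namespaces=ns)
print(hrefs[0])  # → http://www.myweb.com/form/2020-01-01/test.xsd
```

Because the XPath expression names both the element and the attribute, it keeps working if the attributes are reordered or more attributes are added.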

User

#2 · Edited 2024-10-01 15:43:52

Use:

>>> child = root[0]   # first child; getchildren() is deprecated in lxml
>>> child.attrib
{'{http://www.w3.org/1999/xlink}type': 'simple', '{http://www.w3.org/1999/xlink}href': 'http://www.myweb.com/form/2020-01-01/test.xsd'}
>>> url = child.attrib['{http://www.w3.org/1999/xlink}href']

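However, I think the real challenge is knowing which key to use (that is, {http://www.w3.org/1999/xlink}href). If that is the problem, the key can be built from the namespace map: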

>>> key_prefix = root.nsmap['xlink']   # the requested URL is the href attribute in the xlink namespace
>>> print(key_prefix)
http://www.w3.org/1999/xlink
>>> key_url = "{" + key_prefix + "}href"
>>> print(child.attrib[key_url])
http://www.myweb.com/form/2020-01-01/test.xsd
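
The key-building step can also be wrapped in a small helper. This is a minimal sketch, not a definitive implementation: get_schema_ref_url is a hypothetical name, SAMPLE is a trimmed stand-in for the question's XML with the closing tag added, and it assumes the root element declares both the link and xlink prefixes.

```python
from lxml import etree

# Trimmed stand-in for the question's XML; the closing </xml> tag is an assumption.
SAMPLE = b"""<xml xmlns="http://www.myweb.org/2003/instance"
     xmlns:link="http://www.myweb.org/2003/linkbase"
     xmlns:xlink="http://www.w3.org/1999/xlink">
  <link:schemaRef xlink:type="simple"
                  xlink:href="http://www.myweb.com/form/2020-01-01/test.xsd"/>
</xml>"""


def get_schema_ref_url(root):
    """Return xlink:href of the first link:schemaRef element, or None if absent."""
    nsmap = root.nsmap  # assumes 'link' and 'xlink' are declared on the root
    ref = root.find(".//link:schemaRef", namespaces={"link": nsmap["link"]})
    if ref is None:
        return None
    # QName(...).text yields the "{namespace}href" key that lxml uses for attributes.
    key = etree.QName(nsmap["xlink"], "href").text
    return ref.get(key)


root = etree.fromstring(SAMPLE)
print(get_schema_ref_url(root))  # → http://www.myweb.com/form/2020-01-01/test.xsd
```

Using get() rather than attrib[...] returns None instead of raising KeyError when the attribute is missing, which is usually easier to handle for optional links.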
