Traversing XML with Python to find URLs with a specific extension

i = 0
url = ''
while (i < len(xmlTag)):
    if re.search('*.txt', xmlTag[i].toxml()) is not None:
        url = xmlTag[i].toxml()
    i = i + 1

** Some code that parses out the url **

2 Answers

User

#1 · edited 2024-05-19 16:35:08

An example using lxml, urlparse, and os.path:

from lxml import etree
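from urllib.parse import urlparse  # Python 3; on Python 2: from urlparse import urlparse
from os.path import splitext

data = """
<Foo>
    <bar>
        <file url="http://foo.txt"/>
        <file url="http://bar.doc"/>
    </bar>
</Foo>
"""

tree = etree.fromstring(data).getroottree()
for url in tree.xpath('//Foo/bar/file/@url'):
    spliturl = urlparse(url)
    # In this sample data the file name is the host part (netloc);
    # for URLs that carry a path, split spliturl.path instead.
    name, ext = splitext(spliturl.netloc)
    print(url, 'is a', ext, 'file')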
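For URLs that carry a real path, the extension sits in the path component rather than in the host. Below is a minimal Python 3 sketch of that variant using only the standard library. The example.com URLs and the helper name `urls_with_extension` are made up for illustration and are not part of the answer above.

```python
import xml.etree.ElementTree as ET
from os.path import splitext
from urllib.parse import urlparse

# Hypothetical sample data; the URLs are invented for this sketch.
DATA = """<Foo>
    <bar>
        <file url="http://example.com/files/foo.txt?download=1"/>
        <file url="http://example.com/files/bar.doc"/>
    </bar>
</Foo>"""


def urls_with_extension(xml_text, extension):
    """Return the url attributes whose path ends with the given extension."""
    root = ET.fromstring(xml_text)
    matches = []
    # iter('file') visits every <file> element at any depth.
    for node in root.iter('file'):
        url = node.get('url', '')
        # Inspect only the path, so a query string cannot hide the extension.
        ext = splitext(urlparse(url).path)[1]
        if ext.lower() == extension.lower():
            matches.append(url)
    return matches


print(urls_with_extension(DATA, '.txt'))
```

Comparing the extension case-insensitively is a design choice of this sketch; drop the `lower()` calls if `.TXT` and `.txt` should be treated differently.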

User

#2 · edited 2024-05-19 16:35:08

Frankly, your last bit of code is disgusting. dom.getElementsByTagName('file') gives you a list of all the <file> elements in the tree, so just iterate over it:

urls = []
for file_node in dom.getElementsByTagName('file'):
    url = file_node.getAttribute('url')
    if url.endswith('.txt'):
        urls.append(url)
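The loop above relies on a `dom` object from the question that is not shown here. As a self-contained sketch, assuming `dom` comes from `xml.dom.minidom` (the question's use of `toxml()` suggests so) and reusing the sample XML from the first answer:

```python
from xml.dom import minidom

# Sample document borrowed from the first answer; assumed, not from the question.
DATA = """<Foo>
    <bar>
        <file url="http://foo.txt"/>
        <file url="http://bar.doc"/>
    </bar>
</Foo>"""

dom = minidom.parseString(DATA)

# Collect the url attribute of every <file> element that ends in .txt.
urls = [
    node.getAttribute('url')
    for node in dom.getElementsByTagName('file')
    if node.getAttribute('url').endswith('.txt')
]

print(urls)
```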

By the way, you should never have to do indexing by hand in Python. Even in the rare cases where you do need the index number, just use enumerate.

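A minimal sketch of the enumerate pattern, assuming the nodes come from `xml.dom.minidom`; the names `xml_tags` and `found` are illustrative and do not appear in the question:

```python
from xml.dom import minidom

dom = minidom.parseString(
    '<bar><file url="http://foo.txt"/><file url="http://bar.doc"/></bar>'
)
xml_tags = dom.getElementsByTagName('file')

found = []
# enumerate yields (index, item) pairs, so no manual counter is needed.
for index, tag in enumerate(xml_tags):
    url = tag.getAttribute('url')
    if url.endswith('.txt'):
        found.append((index, url))

print(found)
```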
