
Posted 2024-10-06 12:08:20


This is the XML file: http://www.diveintopython3.net/examples/feed.xml


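from lxml import etree

def parse_feed():  # renamed: a function called lxml() shadows the package name
    tree = etree.parse('feed.xml')
    NSMAP = {"nn": "http://www.w3.org/2005/Atom"}
    # parent element of every <category term="html">
    test = tree.xpath('//nn:category[@term="html"]/..', namespaces=NSMAP)
    test1 = tree.xpath('//nn:category', namespaces=NSMAP)
    for node in test1:
        test2 = node.xpath('./../nn:summary', namespaces=NSMAP)  # returns a list of elements
        print([s.text for s in test2])
    # as a predicate, normalize-space(.) only selects text nodes that are not
    # whitespace-only; it does not trim the strings that are returned
    test3 = tree.xpath('//text()[normalize-space(.)]')
    print(test3)

parse_feed()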


['Putting an entire chapter on one page sounds\n    bloated, but consider this — my longest chapter so far\n    would be 75 printed pages, and it loads in under 5 seconds…\n    On dialup.']
['Putting an entire chapter on one page sounds\n    bloated, but consider this — my longest chapter so far\n    would be 75 printed pages, and it loads in under 5 seconds…\n    On dialup.']
['Putting an entire chapter on one page sounds\n    bloated, but consider this — my longest chapter so far\n    would be 75 printed pages, and it loads in under 5 seconds…\n    On dialup.']
['The accessibility orthodoxy does not permit people to\n      question the value of features that are rarely useful and rarely used.']
['These notes will eventually become part of a\n      tech talk on video encoding.']
['These notes will eventually become part of a\n      tech talk on video encoding.']
['These notes will eventually become part of a\n      tech talk on video encoding.']
['These notes will eventually become part of a\n      tech talk on video encoding.']
['These notes will eventually become part of a\n      tech talk on video encoding.']
['These notes will eventually become part of a\n      tech talk on video encoding.']
['These notes will eventually become part of a\n      tech talk on video encoding.']
['These notes will eventually become part of a\n      tech talk on video encoding.']
['\n  ', 'dive into mark', '\n  ', 'currently between addictions', '\n  ', 'tag:diveintomark.org,2001-07-29:/', '\n  ', '2009-03-27T21:56:07Z', '\n  ', '\n  ', '\n  ', '\n    ', '\n      ', 'Mark', '\n      ', 'http://diveintomark.org/', '\n    ', '\n    ', 'Dive into history, 2009 edition', '\n    ', '\n    ', 'tag:diveintomark.org,2009-03-27:/archives/20090327172042', '\n    ', '2009-03-27T21:56:07Z', '\n    ', '2009-03-27T17:20:42Z', '\n    ', '\n    ', '\n    ', '\n  ', 'Putting an entire chapter on one page sounds\n    bloated, but consider this — my longest chapter so far\n    would be 75 printed pages, and it loads in under 5 seconds…\n    On dialup.', '\n  ', '\n  ', '\n    ', '\n      ', 'Mark', '\n      ', 'http://diveintomark.org/', '\n    ', '\n    ', 'Accessibility is a harsh mistress', '\n    ', '\n    ', 'tag:diveintomark.org,2009-03-21:/archives/20090321200928', '\n    ', '2009-03-22T01:05:37Z', '\n    ', '2009-03-21T20:09:28Z', '\n    ', '\n    ', 'The accessibility orthodoxy does not permit people to\n      question the value of features that are rarely useful and rarely used.', '\n  ', '\n  ', '\n    ', '\n      ', 'Mark', '\n    ', '\n    ', 'A gentle introduction to video encoding, part 1: container formats', '\n    ', '\n    ', 'tag:diveintomark.org,2008-12-18:/archives/20081218155422', '\n    ', '2009-01-11T19:39:22Z', '\n    ', '2008-12-18T15:54:22Z', '\n    ', '\n    ', '\n    ', '\n    ', '\n    ', '\n    ', '\n    ', '\n    ', '\n    ', 'These notes will eventually become part of a\n      tech talk on video encoding.', '\n  ', '\n']..








My question is: why are there so many '\n' strings in the result, and how do I remove them?

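One possible way to look at the '\n' question: the whitespace-only strings are the indentation between elements, which the parser keeps as text nodes, and the remaining line breaks sit inside the text itself. The sketch below is only an illustration under assumptions: `SAMPLE` is a small made-up stand-in for feed.xml, and `collapse` is a hypothetical helper name, not part of lxml.

```python
from lxml import etree

# Hypothetical, shortened stand-in for feed.xml (not the real feed).
SAMPLE = b"""<feed xmlns="http://www.w3.org/2005/Atom">
  <title>dive into mark</title>
  <entry>
    <author>
      <name>Mark</name>
    </author>
    <summary>Putting an entire chapter on one page sounds
    bloated, but consider this.</summary>
  </entry>
</feed>"""

root = etree.fromstring(SAMPLE)

# The predicate is false for text nodes that are empty after whitespace
# normalization, so the indentation between elements is not selected.
kept = root.xpath('//text()[normalize-space(.)]')

def collapse(s):
    """Hypothetical helper: collapse runs of whitespace inside one string."""
    return ' '.join(s.split())

# XPath returns the strings unchanged, so inner line breaks are cleaned in Python.
cleaned = [collapse(t) for t in kept]
print(cleaned)
```

The filtering happens in XPath and the trimming in Python, because the predicate only decides which text nodes are selected and never rewrites them.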



An additional question: how can I go directly from a piece of text to the element that holds it, for example get the node whose text is "Mark" (a descendant of entry)?

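For the text-to-element question, two approaches that might work are sketched here. The strings returned by lxml's `xpath()` for `text()` carry a `getparent()` method, and an element can also be selected directly by its text. `SAMPLE` is again a made-up stand-in for feed.xml, and the variable names are only illustrative.

```python
from lxml import etree

NSMAP = {"nn": "http://www.w3.org/2005/Atom"}

# Hypothetical, shortened stand-in for feed.xml (not the real feed).
SAMPLE = b"""<feed xmlns="http://www.w3.org/2005/Atom">
  <entry>
    <author>
      <name>Mark</name>
    </author>
    <title>Dive into history, 2009 edition</title>
  </entry>
</feed>"""

root = etree.fromstring(SAMPLE)

# Option 1: select the element itself by testing its text content.
by_element = root.xpath('//nn:entry//nn:name[text()="Mark"]', namespaces=NSMAP)

# Option 2: select the text node, then walk back to its element.
text_nodes = root.xpath('//text()[. = "Mark"]')
parents = [t.getparent() for t in text_nodes]

# Element.tag includes the namespace; QName(...).localname strips it.
tags = [etree.QName(p).localname for p in parents]
print(tags)
```

One caveat with option 2: for a tail string (text that follows an element's closing tag), `getparent()` returns the element the tail is attached to rather than the enclosing element. The `is_text` and `is_tail` attributes on the result tell the two cases apart.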

