How to access XML attributes in Python

Posted 2024-10-01 11:21:26


Below is a sample of my XML file. I want to access the text "The bread is top notch as well." and the category "food".

<sentences>
     <sentence id="32897564#894393#2">
         <text>The bread is top notch as well.</text>
         <aspectTerms>
             <aspectTerm term="bread" polarity="positive"  from="4" to="9"/>
         </aspectTerms>
         <aspectCategories>
             <aspectCategory category="food" polarity="positive" />
         </aspectCategories>
     </sentence>
</sentences>

My code is:

 test_text_file=open('Restaurants_Test_Gold.txt', 'rt')
 test_text_file1=test_text_file.read()
 root = ET.fromstring(test_text_file1)
 for page in list(root):
     text = page.find('text').text
     Category = page.find('aspectCategory')
     print ('sentence: %s; category: %s' % (text,Category))
 test_text_file.close()

2 Answers

Here is my code to solve your problem:

import os
import xml.etree.ElementTree as ET


basedir = os.path.abspath(os.path.dirname(__file__))
filenamepath = os.path.join(basedir, 'Restaurants_Test_Gold.txt')

# Read the whole file; the with-block closes it automatically
with open(filenamepath, 'r') as test_text_file:
    file_contents = test_text_file.read()

# fromstring() returns the root element (<sentences>), not an ElementTree
root = ET.fromstring(file_contents)

for sentence in root:
    # iter() yields the <sentence> element itself first, so skip it
    sentence_items = list(sentence.iter())[1:]
    for item in sentence_items:
        if item.tag == 'text':
            print(item.text)
        elif item.tag == 'aspectCategories':
            # <aspectCategory> is a direct child of <aspectCategories>
            category = item.find('aspectCategory')
            print(category.attrib.get('category'))

Hope this helps.
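For context on why the code in the question does not print the category: find() with a bare tag name only looks at direct children, and <aspectCategory> sits one level deeper, inside <aspectCategories>. The following is a minimal sketch of that difference. It assumes the sample from the question is inlined as a string (the SAMPLE name is made up here) and that the root element is properly closed.

```python
import xml.etree.ElementTree as ET

# Inline copy of the sample from the question, with the root element closed
SAMPLE = """<sentences>
    <sentence id="32897564#894393#2">
        <text>The bread is top notch as well.</text>
        <aspectTerms>
            <aspectTerm term="bread" polarity="positive" from="4" to="9"/>
        </aspectTerms>
        <aspectCategories>
            <aspectCategory category="food" polarity="positive"/>
        </aspectCategories>
    </sentence>
</sentences>"""

root = ET.fromstring(SAMPLE)
sentence = root.find('sentence')

# A bare tag name only matches direct children of <sentence>
direct = sentence.find('aspectCategory')
# A path through the parent element reaches the grandchild
nested = sentence.find('aspectCategories/aspectCategory')

print(direct)                  # None
print(nested.get('category'))  # food
```

Note also that the question prints the element object itself; the attribute value has to be read with get('category') or attrib['category'].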

It depends on how complex the XML format is. The simplest way is to access the path directly:

import xml.etree.ElementTree as ET

tree = ET.parse('x.xml')
root = tree.getroot()

print(root.find('.//text').text)
print(root.find('.//aspectCategory').attrib['category'])

However, if similarly named tags appear elsewhere in the document, you may need a longer path, such as .//aspectCategories/aspectCategory
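Since find() returns only the first match, a file with many sentences probably needs a loop. Here is a rough sketch of how the longer path could be combined with findall() to collect the text and every category per sentence. The extract() helper, the SAMPLE string, and the second sentence are invented for illustration and are not part of the original file.

```python
import xml.etree.ElementTree as ET

# Hypothetical two-sentence sample; the second sentence is made up for illustration
SAMPLE = """<sentences>
    <sentence id="1">
        <text>The bread is top notch as well.</text>
        <aspectCategories>
            <aspectCategory category="food" polarity="positive"/>
        </aspectCategories>
    </sentence>
    <sentence id="2">
        <text>Nice food but slow service.</text>
        <aspectCategories>
            <aspectCategory category="food" polarity="positive"/>
            <aspectCategory category="service" polarity="negative"/>
        </aspectCategories>
    </sentence>
</sentences>"""


def extract(xml_text):
    """Return a list of (text, [categories]) tuples, one per <sentence>."""
    root = ET.fromstring(xml_text)
    rows = []
    for sentence in root.findall('sentence'):
        # findtext() returns the text of the first matching child
        text = sentence.findtext('text')
        # The explicit path avoids picking up similarly named tags elsewhere
        categories = [
            cat.get('category')
            for cat in sentence.findall('aspectCategories/aspectCategory')
        ]
        rows.append((text, categories))
    return rows


rows = extract(SAMPLE)
for text, categories in rows:
    print('sentence: %s; category: %s' % (text, ', '.join(categories)))
```

For a file on disk, ET.parse(path).getroot() can take the place of ET.fromstring().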
