How to delete a child node in Python when an element value matches and is duplicated

Posted 2024-05-20 17:32:40


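I am trying to parse an XML document and find duplicate values. If a value is duplicated, I need to delete the whole element block in Python. For example: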

<?xml version="1.0" encoding="UTF-8"?><group>
<list-service uri="sip:accc@msg.pc.t-data.com"/>
<hunt xmlns:ht="http://www.t-data.com/xml/hunt" uri="sip:17738078709@msg.pc.t-data.com">
<ht:list>
<ht:huntItem>
<ht:huntUri>17753720@msg.pc.t-data.com</ht:huntUri>
<ht:userId>U-1-f0c8-431c-84fa-6f0dfc6b22de</ht:userId>
</ht:huntItem>
<ht:huntItem>
<ht:huntUri>19462562@msg.pc.t-data.com</ht:huntUri>
<ht:userId>U-1-f0c8-431c-84fa-6f0dfc6b22de</ht:userId>
</ht:huntItem>
<ht:huntItem>
<ht:huntUri>15668433@msg.pc.t-data.com</ht:huntUri>
<ht:userId>U-1-f0c8-431c-84fa-6f0dfc6b22de</ht:userId>
<ht:deviceId>urnmei:-131893-0</ht:deviceId>
</ht:huntItem>
<ht:huntItem>
<ht:huntUri>15668433@msg.pc.t-data.com</ht:huntUri>
<ht:userId>U-1-f0c8-431c-84fa-6f0dfc6b22de</ht:userId>
<ht:deviceId>urnmei:35775808-001226-0</ht:deviceId>
</ht:huntItem>
</ht:list>
</hunt>
</group>

In the XML above, we need to check 15668433@msg.pc.t-data.com:

<ht:huntUri>15668433@msg.pc.t-data.com</ht:huntUri> 

If it is duplicated, the element should be deleted.

I can get the list of values with the code below:

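from xml.dom import minidom

def getChildUsers(source, string):
    """Return the text of every element whose tag name is `string`."""
    try:
        result = []
        data = minidom.parseString(source)
        elementlist = data.getElementsByTagName(string)
        for att in elementlist:
            result.append(att.firstChild.nodeValue)
        return result
    except Exception:
        # On a parse or lookup failure, report it; the function then returns None.
        print('users fetch issue')
        # raise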
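
As a minimal sketch of the next step, the list returned by getChildUsers could be checked for repeated values with collections.Counter. The helper name find_duplicates and the hard-coded sample list are assumptions made for illustration; they are not part of the original post.

```python
from collections import Counter

def find_duplicates(values):
    """Return the values that occur more than once, in first-seen order."""
    counts = Counter(values)
    return [v for v in counts if counts[v] > 1]

# Sample huntUri values taken from the XML in the question.
uris = [
    "17753720@msg.pc.t-data.com",
    "19462562@msg.pc.t-data.com",
    "15668433@msg.pc.t-data.com",
    "15668433@msg.pc.t-data.com",
]
duplicates = find_duplicates(uris)
print(duplicates)
```

Counter counts every value in one pass, so the list does not have to be rescanned for each element.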

1 answer
User
#1 · posted 2024-05-20 17:32:40

I was able to do this with the code below; I hope it helps someone.

    # Split userList into first occurrences (dataF) and repeats (matchF).
    for i in userList:
        if i not in dataF:
            dataF.append(i)
        else:
            matchF.append(i)
    print(matchF)
    if len(matchF) > 0:
        for page in root:                     # iterate over pages
            for elem in page:
                for dat in matchF:
                    # Remove the first child block whose huntUri equals dat.
                    for ev in elem:
                        num = 0
                        for e in ev:
                            # e.tag looks like '{namespace}huntUri'; compare the local name only.
                            if e.tag.split('}')[-1] == 'huntUri' and dat == e.text:
                                num = num + 1
                                break
                        if num > 0:
                            print(num)
                            # Safe to remove here because the loop over elem stops right after.
                            elem.remove(ev)
                            break
        tree.write("out.xml")
        writeF = open(processed_path + "/" + number + "_Final.xml", "w")
        # Join the lines of out.xml with '++' so the prefixes can be rewritten in one pass.
        dataFile = '++'.join(a.strip() for a in open('out.xml', 'r').readlines())
        # Map the generated ns0/ns1 prefixes in the written file back to the original ones.
        data1 = dataFile.replace('ns1:', 'ht:').replace(':ns0', '').replace('ns0:', '').replace(':ns1', ':ht')
        listData = data1.split('++')
        listData[0] = '<group' + namespace + '>'
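
For comparison, here is a self-contained sketch of the same idea using xml.etree.ElementTree. It is not the code from the answer above: it assumes that the first huntItem for each huntUri should be kept and later repeats dropped, and it uses ET.register_namespace to keep the ht: prefix instead of rewriting ns0/ns1 in the output text. The function name remove_duplicate_hunt_items and the shortened sample document are made up for illustration.

```python
import xml.etree.ElementTree as ET

HT_NS = "http://www.t-data.com/xml/hunt"  # namespace URI from the sample XML

def remove_duplicate_hunt_items(xml_text):
    """Keep the first huntItem for each huntUri and drop later repeats."""
    # Register the prefix (a global setting) so the output uses 'ht:'
    # rather than a generated prefix such as 'ns0:'.
    ET.register_namespace("ht", HT_NS)
    root = ET.fromstring(xml_text)
    ns = {"ht": HT_NS}
    for hunt_list in root.iter("{%s}list" % HT_NS):
        seen = set()
        # Iterate over a copy because items are removed from hunt_list.
        for item in list(hunt_list.findall("ht:huntItem", ns)):
            uri = item.findtext("ht:huntUri", default="", namespaces=ns).strip()
            if uri in seen:
                hunt_list.remove(item)
            else:
                seen.add(uri)
    return ET.tostring(root, encoding="unicode")

# Shortened version of the document from the question (assumed input).
sample = """<group>
<hunt xmlns:ht="http://www.t-data.com/xml/hunt" uri="sip:17738078709@msg.pc.t-data.com">
<ht:list>
<ht:huntItem><ht:huntUri>17753720@msg.pc.t-data.com</ht:huntUri></ht:huntItem>
<ht:huntItem><ht:huntUri>15668433@msg.pc.t-data.com</ht:huntUri><ht:deviceId>urnmei:-131893-0</ht:deviceId></ht:huntItem>
<ht:huntItem><ht:huntUri>15668433@msg.pc.t-data.com</ht:huntUri><ht:deviceId>urnmei:35775808-001226-0</ht:deviceId></ht:huntItem>
</ht:list>
</hunt>
</group>"""
cleaned = remove_duplicate_hunt_items(sample)
print(cleaned)
```

Tracking seen values in a set avoids building the separate dataF and matchF lists, and removing from a copied list sidesteps the need to break out of the loop after each removal. If the later entry should survive instead of the first, the items could be walked in reverse order.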
