Converting a GraphML file to anoth

<?xml version="1.0" ?>
<graphml xmlns="http://graphml.graphdrawing.org/xmlns"
         xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
         xsi:schemaLocation="http://graphml.graphdrawing.org/xmlns http://graphml.graphdrawing.org/xmlns/1.1/graphml.xsd">
  <key id="weight" for="edge" attr.name="weight" attr.type="string"></key>
  <graph id="G" edgedefault="directed">
    <node id="1"></node>
    <node id="2"></node>
    <node id="3"></node>
    <node id="4"></node>
    <node id="5"></node>
    <edge id="6" source="1" target="2">
      <data key="weight">3</data>
    </edge>
    <edge id="7" source="2" target="4">
      <data key="weight">1</data>
    </edge>
    <edge id="8" source="2" target="3">
      <data key="weight">9</data>
    </edge>
  </graph>
</graphml>

2 answers

User

Reply 1 · edited 2024-09-29 17:13:46

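There is a Python module for handling GraphML. Oddly, its documentation lists no remove or delete function.

Since GraphML is XML markup, you can use an XML module instead. I have used xmltodict and like it a lot. It lets you load XML into a Python object, modify that object, and then save it back to XML.

If data is a string containing the XML: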

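import xmltodict

# Parse the XML string into nested dictionaries
data_object = xmltodict.parse(data)
# Drop every <node> element under <graph>
del data_object["graphml"]["graph"]["node"]
# Serialize the modified structure back to an XML string
new_xml = xmltodict.unparse(data_object, pretty=True)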

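This deletes the node entries, and unparse returns a string containing the XML.

If the structure of the XML gets more complex, you will need to search data_object for the nodes. That should not be a problem, though: it is just an ordered dictionary.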
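Searching the parsed structure could look like the minimal sketch below. It assumes the dictionary shape xmltodict produces for the sample above: attributes stored under keys prefixed with `@`, and repeated elements stored as a list (a single child becomes a bare dictionary, which the helper normalizes). The names `as_list` and `remove_nodes_by_id` are hypothetical, dropping edges that touch a removed node is an added assumption, and the input dictionary is written by hand so the block runs without xmltodict installed.

```python
def as_list(value):
    """Return value as a list; a single child is parsed as a bare dict."""
    if value is None:
        return []
    return value if isinstance(value, list) else [value]


def remove_nodes_by_id(data_object, ids_to_remove):
    """Drop nodes whose id is in ids_to_remove, plus edges touching them."""
    graph = data_object["graphml"]["graph"]
    ids = set(ids_to_remove)
    graph["node"] = [n for n in as_list(graph.get("node")) if n["@id"] not in ids]
    graph["edge"] = [
        e for e in as_list(graph.get("edge"))
        if e["@source"] not in ids and e["@target"] not in ids
    ]
    return data_object


# Hand-written stand-in for xmltodict.parse(data) on a trimmed copy of the sample.
sample = {
    "graphml": {
        "graph": {
            "@id": "G",
            "@edgedefault": "directed",
            "node": [{"@id": "1"}, {"@id": "2"}, {"@id": "3"}],
            "edge": [
                {"@id": "6", "@source": "1", "@target": "2",
                 "data": {"@key": "weight", "#text": "3"}},
                {"@id": "8", "@source": "2", "@target": "3",
                 "data": {"@key": "weight", "#text": "9"}},
            ],
        }
    }
}

result = remove_nodes_by_id(sample, ["1"])
print([n["@id"] for n in result["graphml"]["graph"]["node"]])  # ['2', '3']
print([e["@id"] for e in result["graphml"]["graph"]["edge"]])  # ['8']
```

The filtered dictionary can then be passed to xmltodict.unparse as in the snippet above.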

Another problem may be the size of the XML. 3 GB is a lot. xmltodict does support a streaming mode for large files, but I have never used it.

User

Reply 2 · edited 2024-09-29 17:13:46

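After reading a few links, I came up with a solution based on iterative parsing. However, I could not work out how plain parsing and iterparse differ in terms of RAM usage.

Important links:
- http://www.ibm.com/developerworks/xml/library/x-hiperfparse/
- using lxml and iterparse() to parse a big (+- 1Gb) XML file

Code: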

import lxml.etree as et

# Fully qualified tag names and paths in the GraphML namespace
graphml = {
    "graph": "{http://graphml.graphdrawing.org/xmlns}graph",
    "node": "{http://graphml.graphdrawing.org/xmlns}node",
    "edge": "{http://graphml.graphdrawing.org/xmlns}edge",
    "data": "{http://graphml.graphdrawing.org/xmlns}data",
    "weight": "{http://graphml.graphdrawing.org/xmlns}data[@key='weight']",
    "edgeid": "{http://graphml.graphdrawing.org/xmlns}data[@key='edgeid']"
}

# Visit each <edge> element as soon as its closing tag has been read
for event, elem in et.iterparse("/data/sample.graphml", tag=graphml.get("edge"), events=("end",)):
    print(et.tostring(elem))
    # Free the element's own content
    elem.clear()
    # Delete already-processed siblings that the parent still references
    while elem.getprevious() is not None:
        del elem.getparent()[0]
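One way to look into the open question about RAM usage is to measure both approaches on the same input. The sketch below is only an illustration under several assumptions. It uses the standard library's xml.etree.ElementTree instead of lxml, so the cleanup step removes each finished edge from a remembered graph element rather than calling getprevious() and getparent(). It generates a synthetic document rather than reading /data/sample.graphml, and it reports the peak traced by tracemalloc, which is not the same as the process's total RAM. All function names are hypothetical.

```python
import io
import tracemalloc
import xml.etree.ElementTree as ET

NS = "{http://graphml.graphdrawing.org/xmlns}"


def make_graphml(n_edges):
    """Build a synthetic GraphML document with n_edges weighted edges."""
    parts = ['<graphml xmlns="http://graphml.graphdrawing.org/xmlns">'
             '<graph id="G" edgedefault="directed">']
    for i in range(n_edges):
        parts.append(
            f'<edge id="e{i}" source="{i}" target="{i + 1}">'
            f'<data key="weight">1</data></edge>'
        )
    parts.append("</graph></graphml>")
    return "".join(parts).encode("utf-8")


def count_edges_full(xml_bytes):
    """Plain parsing: the whole tree is built before anything is counted."""
    root = ET.parse(io.BytesIO(xml_bytes)).getroot()
    return sum(1 for _ in root.iter(NS + "edge"))


def count_edges_streaming(xml_bytes):
    """Iterative parsing: each finished edge is detached from its parent."""
    count = 0
    graph = None
    events = ET.iterparse(io.BytesIO(xml_bytes), events=("start", "end"))
    for event, elem in events:
        if event == "start" and elem.tag == NS + "graph":
            graph = elem
        elif event == "end" and elem.tag == NS + "edge":
            count += 1
            graph.remove(elem)  # assumes edges are direct children of <graph>
    return count


def peak_bytes(func, xml_bytes):
    """Run func under tracemalloc and return (result, peak traced bytes)."""
    tracemalloc.start()
    result = func(xml_bytes)
    _, peak = tracemalloc.get_traced_memory()
    tracemalloc.stop()
    return result, peak


document = make_graphml(20000)
full_count, full_peak = peak_bytes(count_edges_full, document)
stream_count, stream_peak = peak_bytes(count_edges_streaming, document)
print("full parse:", full_count, "edges, peak", full_peak, "bytes")
print("iterparse: ", stream_count, "edges, peak", stream_peak, "bytes")
```

The absolute numbers depend on the Python version and the input, so the comparison is most meaningful when run against a file close in size and shape to the real data.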
