ListComparator

Compare ordered lists, with XML and CSV applications

Contents

Detailed Documentation
Contributors
- with contributions of
Change history

Detailed Documentation

XML and CSV comparisons

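Two scripts are provided: xml-cmp and csv-cmp. Both compare two files and write the delta as three output files: one for deletions (suppr), one for additions, and one for changes.

The extensions of the output files are forced to xml or csv, respectively.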
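
The internals of csv-cmp are not described here. As a rough sketch of the general idea, and assuming each CSV row is treated as one list element, the standard csv module can turn CSV content into the kind of list-of-lists input used in the list comparison section. The helper name read_rows is hypothetical and is not part of ListComparator.

```python
import csv
import io


def read_rows(csv_text):
    """Parse CSV text into a list of rows (each row is a list of strings)."""
    return list(csv.reader(io.StringIO(csv_text)))


# Two small in-memory "files"; real use would read from disk instead.
old_rows = read_rows("62145,azerty\n1234,qwerty\n9876,ipsum\n")
new_rows = read_rows("62145,azerty\n1234,qwertw\n4865,lorem\n")
```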

List comparison

ListComparator provides a Comparator object that finds the differences between two lists, provided that the elements of both lists appear in the same order.

>>> old = [1, 2, 3, 4, 5, 6]
>>> new = [1, 3, 4, 7, 6]

>>> from listcomparator.comparator import Comparator

Let's create a Comparator object

>>> comp = Comparator(old,new)

The check method fills in the additions and deletions attributes

>>> comp.check()
>>> comp.additions
[7]
>>> comp.deletions
[2, 5]
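
How check computes its result is not shown in this document. As an illustration only, the same additions and deletions can be obtained for this example with the standard library's difflib, which also relies on the elements keeping their relative order. The function name ordered_diff is hypothetical, and this is not the Comparator implementation.

```python
from difflib import SequenceMatcher


def ordered_diff(old, new):
    """Return (additions, deletions) between two ordered sequences.

    Elements must be hashable, which is a difflib requirement.
    """
    additions, deletions = [], []
    matcher = SequenceMatcher(None, old, new, autojunk=False)
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        # "replace" contributes to both sides: old items go, new items come.
        if tag in ("delete", "replace"):
            deletions.extend(old[i1:i2])
        if tag in ("insert", "replace"):
            additions.extend(new[j1:j2])
    return additions, deletions


additions, deletions = ordered_diff([1, 2, 3, 4, 5, 6], [1, 3, 4, 7, 6])
print(additions)  # [7]
print(deletions)  # [2, 5]
```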

We can also use lists of lists

>>> old_list = [['62145', 'azerty'], ['1234', 'qwerty'], ['9876', 'ipsum']]
>>> new_list = [['62145', 'azerty'], ['1234', 'qwertw'], ['4865', 'lorem']]
>>> comp = Comparator(old_list, new_list)
>>> comp.check()
>>> comp.additions
[['1234', 'qwertw'], ['4865', 'lorem']]
>>> comp.deletions
[['1234', 'qwerty'], ['9876', 'ipsum']]

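A problem arises when a modification, in our case 'qwerty' becoming 'qwertw', appears in both outputs, comp.additions and comp.deletions. You may prefer to treat this as a change. The Comparator can handle this if you supply a function that tells it how to recognize such cases. In our example, two lists describe the same item if their first element, a kind of ID, is the same.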

>>> def my_key(x):
...     return x[0]
...

The getChanges method then provides a new attribute: changes

>>> comp.getChanges(my_key)
>>> comp.changes
[['1234', 'qwertw']]

Of course, additions and deletions remain unchanged

>>> comp.additions
[['1234', 'qwertw'], ['4865', 'lorem']]
>>> comp.deletions
[['1234', 'qwerty'], ['9876', 'ipsum']]

You may want to consider only "pure" additions and deletions. getChanges accepts the keyword argument 'purge' to do this

>>> comp.getChanges(my_key, purge=True)
>>> comp.changes
[['1234', 'qwertw']]
>>> comp.additions
[['4865', 'lorem']]
>>> comp.deletions
[['9876', 'ipsum']]
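
The behaviour shown above can be approximated in a few lines. This is a minimal sketch under the assumption that an added item counts as a change whenever a deleted item has the same key; the function name split_changes is hypothetical and the real getChanges may work differently.

```python
def split_changes(additions, deletions, key, purge=False):
    """Return (changes, additions, deletions).

    An added item is reported as a change when some deleted item shares
    its key. With purge=True, changed items are removed from both the
    additions and the deletions that are returned.
    """
    deleted_keys = {key(item) for item in deletions}
    changes = [item for item in additions if key(item) in deleted_keys]
    if purge:
        changed_keys = {key(item) for item in changes}
        additions = [a for a in additions if key(a) not in changed_keys]
        deletions = [d for d in deletions if key(d) not in changed_keys]
    return changes, additions, deletions


changes, pure_additions, pure_deletions = split_changes(
    [['1234', 'qwertw'], ['4865', 'lorem']],
    [['1234', 'qwerty'], ['9876', 'ipsum']],
    key=lambda row: row[0],  # first column acts as the ID
    purge=True,
)
```

Keys must be hashable here because they are collected in a set; the string IDs of the example satisfy that.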

The old and new attributes store the lists being compared. You may want to reset them; Comparator provides a purgeOldNew method to free the memory

>>> comp.old
[['62145', 'azerty'], ['1234', 'qwerty'], ['9876', 'ipsum']]
>>> comp.new
[['62145', 'azerty'], ['1234', 'qwertw'], ['4865', 'lorem']]
>>> comp.purgeOldNew()
>>> comp.old
>>> comp.new

compare XML files

The Comparator can be used to compare XML files. Let's make two XML files describing books

>>> old='''<?xml version="1.0" ?>
... <infos>
... <book><title>White pages 1995</title>
... <author>
... <surname>La Poste</surname>
... </author>
... <chapter><title>Paris</title>
... <para>ABEL Antoine 82 23 44 12</para>
... <para>ABEL Pierre 82 67 23 12</para>
... </chapter>
... </book>
... <book><title>Yellow pages 2007</title>
... <author>
... <surname>La Poste</surname>
... </author>
... <chapter><title>Bretagne</title>
... <para>Zindep 82 23 44 12</para>
... <para>ZYM 82 67 23 12</para>
... </chapter>
... </book>
... <book><title>Dark pages 2007</title>
... <author>
... <surname>La Poste</surname>
... </author>
... <chapter><title>Greves</title>
... <para>SNCF 82 23 44 12</para>
... </chapter>
... </book>
... </infos>
... '''

>>> new='''<?xml version="1.0"?>
... <infos>
... <book><title>White pages 1995</title>
... <author>
... <surname>La Poste</surname>
... </author>
... <chapter><title>Paris</title>
... <para>ABIL Antoine 82 23 44 12</para>
... <para>ABEL Pierre 82 67 23 12</para>
... </chapter>
... </book>
... <book><title>Yellow pages 2007</title>
... <author>
... <surname>La Poste</surname>
... </author>
... <chapter><title>Bretagne</title>
... <para>Zindep 82 23 44 12</para>
... <para>ZYM 82 67 23 12</para>
... </chapter>
... </book>
... <book><title>Blue pages 2007</title>
... <author>
... <surname>La Poste</surname>
... </author>
... <chapter><title>Bretagne</title>
... <para>Mer 82 23 44 12</para>
... <para>Ciel 82 67 23 12</para>
... </chapter>
... </book>
... </infos>
... '''

Parsing the XML requires elementtree

>>> from elementtree import ElementTree as ET

For this test we will use cStringIO instead of files

>>> import cStringIO
>>> ex_old = cStringIO.StringIO(old)
>>> ex_new = cStringIO.StringIO(new)

We parse the contents

>>> root_old = ET.parse(ex_old).getroot()
>>> root_new = ET.parse(ex_new).getroot()

The 'book' tag identifies the objects we want

>>> objects_old = root_old.findall('book')
>>> objects_new = root_new.findall('book')

Since the two sets of element objects cannot be compared directly, we serialize them to strings

>>> objects_old = [ET.tostring(o) for o in objects_old]
>>> objects_new = [ET.tostring(o) for o in objects_new]
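
The session above uses the Python 2 era elementtree and cStringIO modules. On Python 3, the standard library's xml.etree.ElementTree and io modules can play the same roles. The following is a sketch with a shortened, made-up sample document, not a tested port of this package.

```python
import io
import xml.etree.ElementTree as ET

# Shortened sample; the structure mirrors the <infos>/<book> layout above.
sample = """<infos>
<book><title>White pages 1995</title></book>
<book><title>Yellow pages 2007</title></book>
</infos>"""

root = ET.parse(io.StringIO(sample)).getroot()

# encoding="unicode" makes tostring return str instead of bytes,
# so the serialized books can be compared as plain strings.
books = [ET.tostring(book, encoding="unicode") for book in root.findall("book")]
```

Each serialized string keeps the whitespace that followed the element in the source, which is why the outputs in this document end with a blank line.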

From here on, the Comparator can be used

>>> my_comp = Comparator(objects_old, objects_new)
>>> my_comp.check()

>>> for e in my_comp.additions:
...     print e
...
<book><title>White pages 1995</title>
<author>
<surname>La Poste</surname>
</author>
<chapter><title>Paris</title>
<para>ABIL Antoine 82 23 44 12</para>
<para>ABEL Pierre 82 67 23 12</para>
</chapter>
</book>
<BLANKLINE>
<book><title>Blue pages 2007</title>
<author>
<surname>La Poste</surname>
</author>
<chapter><title>Bretagne</title>
<para>Mer 82 23 44 12</para>
<para>Ciel 82 67 23 12</para>
</chapter>
</book>
<BLANKLINE>

>>> for e in my_comp.deletions:
...     print e
...
<book><title>White pages 1995</title>
<author>
<surname>La Poste</surname>
</author>
<chapter><title>Paris</title>
<para>ABEL Antoine 82 23 44 12</para>
<para>ABEL Pierre 82 67 23 12</para>
</chapter>
</book>
<BLANKLINE>
<book><title>Dark pages 2007</title>
<author>
<surname>La Poste</surname>
</author>
<chapter><title>Greves</title>
<para>SNCF 82 23 44 12</para>
</chapter>
</book>
<BLANKLINE>

We need to know which tag uniquely identifies an object. Here we choose the 'title' tag

>>> def item_signature(xml_element):
...     title = xml_element.find('title')
...     return title.text
...

We build the custom key function for the Comparator to use

>>> def my_key(str):
...     file_like = cStringIO.StringIO(str)
...     root = ET.parse(file_like)
...     return item_signature(root)
...
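
For readers on Python 3, an equivalent key function can be written with the standard library alone. This is a sketch; the name title_key is hypothetical, and it assumes the book's own title is the first direct title child of the serialized element.

```python
import xml.etree.ElementTree as ET


def title_key(serialized_book):
    """Return the text of the first direct <title> child of a <book> string."""
    return ET.fromstring(serialized_book).findtext("title")


key = title_key(
    "<book><title>White pages 1995</title>"
    "<chapter><title>Paris</title></chapter></book>"
)
```

Because findtext only looks at direct children for a plain tag name, the chapter title nested deeper in the book is not picked up.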

The Comparator's getChanges method can then be used

>>> my_comp.getChanges(my_key, purge=True)

Which books were purely added?

>>> for e in my_comp.additions:
...     print e
...
<book><title>Blue pages 2007</title>
<author>
<surname>La Poste</surname>
</author>
<chapter><title>Bretagne</title>
<para>Mer 82 23 44 12</para>
<para>Ciel 82 67 23 12</para>
</chapter>
</book>
<BLANKLINE>

Which books were purely deleted?

>>> for e in my_comp.deletions:
...     print e
...
<book><title>Dark pages 2007</title>
<author>
<surname>La Poste</surname>
</author>
<chapter><title>Greves</title>
<para>SNCF 82 23 44 12</para>
</chapter>
</book>
<BLANKLINE>

Which books changed? That is, which have the same title but differ in other values?

>>> for e in my_comp.changes:
...     print e
...
<book><title>White pages 1995</title>
<author>
<surname>La Poste</surname>
</author>
<chapter><title>Paris</title>
<para>ABIL Antoine 82 23 44 12</para>
<para>ABEL Pierre 82 67 23 12</para>
</chapter>
</book>
<BLANKLINE>

We can then write these results back to XML files
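
This document does not show that last step. One possible way to do it on Python 3, sketched here under the assumption that the results are serialized book strings wrapped again in an infos root, is shown below. The function name build_document and the file name in the comment are hypothetical.

```python
import xml.etree.ElementTree as ET


def build_document(serialized_books, root_tag="infos"):
    """Wrap serialized <book> strings in a single root element."""
    root = ET.Element(root_tag)
    for text in serialized_books:
        root.append(ET.fromstring(text))
    return ET.ElementTree(root)


tree = build_document(["<book><title>Blue pages 2007</title></book>"])

# Writing to disk is left commented out so the sketch has no side effects:
# tree.write("additions.xml", encoding="utf-8", xml_declaration=True)
```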

This code is PEP8 compliant.
It is fully tested, with 100% coverage.
A buildbot runs the tests on every commit.

Contributors

Main developer
Nicolas Laurance <nlaurance at zindep dot com>
with contributions of
Yves Mahe <Ymahe at zindep dot com>

Change history

New in 0.1
First release
