在元组列表中查找重复项Python问题的回答

在元组列表中查找重复项Python

回答此问题可获得 20 贡献值，回答如果被采纳可获得 50 分。

0 条评论
分类：Python问答

默认排序时间排序

1 个回答

匿名 1天前

　擅长：python、mysql、java

更新：重读问题后，您似乎在尝试创建等价类，而不是收集键的值。如果 <pre><code>[[(1, 2), (3, 4), (2, 3)]] </code></pre> 应该变成 ^{pr2}$ ，然后需要将输入解释为图形并应用连接组件算法。您可以将数据结构转换为<a href="http://en.wikipedia.org/wiki/Adjacency_list" rel="nofollow">adjacency list</a>表示，并使用广度优先或深度优先搜索遍历它，或者遍历列表并构建<a href="http://en.wikipedia.org/wiki/Disjoint-set_data_structure" rel="nofollow">disjoint sets</a>。无论哪种情况，你的代码都会突然涉及大量与图相关的复杂性，而且很难根据输入的顺序提供任何输出排序保证。以下是基于广度优先搜索的算法： <pre><code>import collections # build an adjacency list representation of your input graph = collections.defaultdict(set) for l in ListA: for first, second in l: graph[first].add(second) graph[second].add(first) # breadth-first search the graph to produce the output output = [] marked = set() # a set of all nodes whose connected component is known for node in graph: if node not in marked: # this node is not in any previously seen connected component # run a breadth-first search to determine its connected component frontier = set([node]) connected_component = [] while frontier: marked |= frontier connected_component.extend(frontier) # find all unmarked nodes directly connected to frontier nodes # they will form the new frontier new_frontier = set() for node in frontier: new_frontier |= graph[node] - marked frontier = new_frontier output.append(tuple(connected_component)) </code></pre> 但是，不要只是在没有理解的情况下复制它；要理解它在做什么，或者编写自己的实现。你可能需要能够维持这个。（我会使用伪代码，但Python实际上已经和伪代码一样简单了。） 如果我对您问题的最初解释是正确的，并且您的输入是要聚合的键值对的集合，那么我的原始答案如下： 原始答案 <pre><code>import collections clusterer = collections.defaultdict(list) for l in ListA: for k, v in l: clusterer[k].append(v) output = clusterer.values() </code></pre> <code>defaultdict(list)</code>是一个<code>dict</code>，它自动创建一个<code>list</code>作为任何不存在的键的值。循环遍历所有元组，收集与同一个键匹配的所有值，然后从defaultdict创建一个（key，value_list）对列表。在 （此代码的输出与您指定的格式不完全相同，但我相信这种格式更有用。如果你想改变形式，那应该是件简单的事。）

在元组列表中查找重复项Python

1 个回答

相关Python问题