创建最小连通有向无环图

2024-09-29 23:22:22 发布

您现在位置:Python中文网/ 问答频道 /正文

我在NetworkX中有一个有向无环简单图。在

现在,对于每个边,该边有一个“源”和一个“目标”。如果除了这条边之外还有一条从“源”到“目标”的路径,那么我想删除这条边。在

  1. NetworkX是否有一个内置函数来执行此操作?在

我真的不想再发明轮子。在

  1. [可选]只有在1的答案是“否”的情况下,实现这一点的最有效算法是什么(对于一个相当密集的图)?在

以下是需要清洁的DAG示例:

  • 节点包括:

    ['termsequence', 'maximumdegree', 'emptymultigraph', 'minimum', 'multiset', 'walk', 'nonemptymultigraph', 'euleriantrail', 'nonnullmultigraph', 'cycle', 'loop', 'abwalk', 'endvertices', 'simplegraph', 'vertex', 'multipletrails', 'edge', 'set', 'stroll', 'union', 'trailcondition', 'nullmultigraph', 'trivialmultigraph', 'sequence', 'multiplepaths', 'path', 'degreevertex', 'onedgesonvertices', 'nontrivialmultigraph', 'adjacentedges', 'adjacentvertices', 'simpleedge', 'maximum', 'multipleloops', 'length', 'circuit', 'class', 'euleriangraph', 'incident', 'minimumdegree', 'orderedpair', 'unique', 'closedwalk', 'multipleedges', 'pathcondition', 'multigraph', 'trail']
    
  • 边缘是:

    [('termsequence', 'endvertices'), ('emptymultigraph', 'nonemptymultigraph'), ('minimum', 'minimumdegree'), ('multiset', 'trailcondition'), ('multiset', 'pathcondition'), ('multiset', 'multigraph'), ('walk', 'length'), ('walk', 'closedwalk'), ('walk', 'abwalk'), ('walk', 'trail'), ('walk', 'endvertices'), ('euleriantrail', 'euleriangraph'), ('loop', 'simplegraph'), ('loop', 'degreevertex'), ('loop', 'simpleedge'), ('loop', 'multipleloops'), ('endvertices', 'abwalk'), ('vertex', 'adjacentvertices'), ('vertex', 'onedgesonvertices'), ('vertex', 'walk'), ('vertex', 'adjacentedges'), ('vertex', 'multipleedges'), ('vertex', 'edge'), ('vertex', 'multipleloops'), ('vertex', 'degreevertex'), ('vertex', 'incident'), ('edge', 'adjacentvertices'), ('edge', 'onedgesonvertices'), ('edge', 'multipleedges'), ('edge', 'simpleedge'), ('edge', 'adjacentedges'), ('edge', 'loop'), ('edge', 'trailcondition'), ('edge', 'pathcondition'), ('edge', 'walk'), ('edge', 'incident'), ('set', 'onedgesonvertices'), ('set', 'edge'), ('union', 'multiplepaths'), ('union', 'multipletrails'), ('trailcondition', 'trail'), ('nullmultigraph', 'nonnullmultigraph'), ('sequence', 'walk'), ('sequence', 'endvertices'), ('path', 'cycle'), ('path', 'multiplepaths'), ('degreevertex', 'maximumdegree'), ('degreevertex', 'minimumdegree'), ('onedgesonvertices', 'multigraph'), ('maximum', 'maximumdegree'), ('circuit', 'euleriangraph'), ('class', 'multiplepaths'), ('class', 'multipletrails'), ('incident', 'adjacentedges'), ('incident', 'degreevertex'), ('incident', 'onedgesonvertices'), ('orderedpair', 'multigraph'), ('closedwalk', 'circuit'), ('closedwalk', 'cycle'), ('closedwalk', 'stroll'), ('pathcondition', 'path'), ('multigraph', 'euleriangraph'), ('multigraph', 'nullmultigraph'), ('multigraph', 'trivialmultigraph'), ('multigraph', 'nontrivialmultigraph'), ('multigraph', 'emptymultigraph'), ('multigraph', 'euleriantrail'), ('multigraph', 'simplegraph'), ('trail', 'path'), ('trail', 'circuit'), ('trail', 'multipletrails')]
    

Tags: pathloopwalkvertextrailedgeincidentmultiset
3条回答

这是一个简单的通用算法。算法可以向前或向后运行。这个和乔尔的答案基本上是双重的,他向后跑,这个向前跑:

def remove_redundant_edges(G):
    """
    Remove redundant edges from a DAG using networkx (nx).
    An edge is redundant if there is an alternate path
    from its start node to its destination node.

    This algorithm could work front to back, or back to front.
    We choose to work front to back.

    The main persistent variable (in addition to the graph
    itself) is indirect_pred_dict, which is a dictionary with
    one entry per graph node.  Each entry is a set of indirect
    predecessors of this node.

    The algorithmic complexity of the code on a worst-case
    fully-connected graph is O(V**3), where V is the number
    of nodes.
    """

    indirect_pred_dict = collections.defaultdict(set)
    for node in nx.topological_sort(G):
        indirect_pred = indirect_pred_dict[node]
        direct_pred = G.predecessors(node)
        for pred in direct_pred:
            if pred in indirect_pred:
                G.remove_edge(pred, node)
        indirect_pred.update(direct_pred)
        for succ in G.successors(node):
            indirect_pred_dict[succ] |= indirect_pred

复杂性分析与大O优化

对于一个最小连通图,其中每个节点只连接到一条边,其复杂性为O(V+E)。然而,即使对于一个简单的线性图(其中每个节点都有一个传入边和一个传出边),复杂度也是O(V*E),而对于一个最大连通图(这是最坏的情况,其中每个节点都连接到图上的每个下游节点),复杂性也是O(V**3)。对于这种情况,ops的数量遵循序列A000292,这是n * (n+1) * (n+2) / 6,其中n是节点数(V)减去3。在

根据图形的形状,还可以进行其他优化。下面是一个带有几种不同优化器的版本,它们可以显著降低某些类型图形的复杂性和运行时:

^{pr2}$

我还没有分析是否可以构造一个稠密但非最大连通的图,该图的执行复杂度大于O(V**2),并且启用了优化密集选项,但我没有理由先验地相信这是不可能的。优化对最大连通图最有效,并且不会做任何事情,例如,在每个节点与其子节点共享继承者而不是子节点的情况下,我还没有分析这种情况下的运行时。在

示例试验台

我已经剥离了基本算法的代码,添加了记录最坏情况路径所需操作数的工具,以及生成最大连接图的示例测试生成器。在

import collections
import networkx as nx

def makegraph(numnodes):
    """
    Make a fully-connected graph given a number of nodes
    """
    edges = []
    for i in range(numnodes):
        for j in range(i+1, numnodes):
            edges.append((i, j))
    return nx.DiGraph(edges)

def remove_redundant_edges(G):
    ops = 0
    indirect_pred_dict = collections.defaultdict(set)
    for node in nx.topological_sort(G):
        indirect_pred = indirect_pred_dict[node]
        direct_pred = G.predecessors(node)
        for pred in direct_pred:
            if pred in indirect_pred:
                G.remove_edge(pred, node)
        indirect_pred.update(direct_pred)
        for succ in G.successors(node):
            indirect_pred_dict[succ] |= indirect_pred
            ops += len(indirect_pred)
    return ops

def test_1(f, numnodes):
    G = makegraph(numnodes)
    e1 = nx.number_of_edges(G)
    ops = f(G)
    e2 = nx.number_of_edges(G)
    return ops, e1, e2

for numnodes in range(30):
    a = test_1(remove_redundant_edges, numnodes)
    print numnodes, a[0]

是的。在

您需要使用all_simple_paths[documentation](它提供了一个生成器,包含了这两者之间的所有简单路径)。一旦找到第二个就退出,这样它就不会计算所有的值。在

定义后,您需要查看每个边,如果从源到目标有多个路径,则删除该边。在

def multiple_paths(G,source,target):
    '''returns True if there are multiple_paths, False otherwise'''
    path_generator = nx.all_simple_paths(G, source=source, target=target)
    counter = 0
    for path in path_generator: #test to see if there are multiple paths
        counter += 1
        if counter >1:  
            break  #instead of breaking, could have return True
    if counter >1:  #counter == 2
        return True
    else:  #counter == 0 or 1
        return False

import networkx as nx
G=nx.DiGraph()
G.add_edges_from([(0,1), (1,2), (1,3), (0,3), (2,3)])
multiple_paths(G,0,1)
> False
multiple_paths(G,0,2) 
> False
multiple_paths(G,0,3)
> True

for edge in G.edges_iter():  #let's do what you're trying to do
    if multiple_paths(G, edge[0], edge[1]):
        G.remove_edge(edge[0],edge[1])

G.edges()
> [(0, 1), (1, 2), (2, 3)]

注意,您也可以删除边缘,然后运行has_path来查看是否还有路径。如果没有,则将边添加回来。在

^{pr2}$

但是,如果有任何边缘数据,您需要小心,而且我不喜欢删除边缘然后将其添加回来的可能性-这会为一些难以找到的bug打开机会。在

我之前的回答特别涉及到一个直接的问题,即是否有一个好的方法来测试单个边缘是否冗余。在

看起来你真的需要一种方法来有效地移除所有冗余的边。这意味着你需要一种同时完成所有任务的方法。这是一个不同的问题,但这里有一个答案。我不认为networkx有什么内置的东西,但它不难找到一个可行的算法。在

这是因为它是一个DAG,所以有些节点没有出边。从它们开始并处理它们。一旦所有这些都被处理,那么它们的一个子集就没有未被处理的子代。看看那些父母。重复。在每个阶段,未处理的节点集是一个DAG,我们正在处理该DAG的“结束节点”。它保证完成(如果原始网络是有限的)。在

在实现中,每当我们处理一个节点时,我们首先检查是否有任何子节点也是间接子代。如果是,请删除边。如果没有,就别管它。当处理所有子对象时,我们通过将其所有子对象添加到父对象的间接子对象集中来更新父对象的信息。如果父对象的所有子对象都被处理,我们现在将其添加到下一次迭代的列表中。在

import networkx as nx
from collections import defaultdict

def remove_redundant_edges(G):
    processed_child_count = defaultdict(int)  #when all of a nodes children are processed, we'll add it to nodes_to_process
    descendants = defaultdict(set)            #all descendants of a node (including children)
    out_degree = {node:G.out_degree(node) for node in G.nodes_iter()}
    nodes_to_process = [node for node in G.nodes_iter() if out_degree[node]==0] #initially it's all nodes without children
    while nodes_to_process:
        next_nodes = []
        for node in nodes_to_process:
            '''when we enter this loop, the descendants of a node are known, except for direct children.'''
            for child in G.neighbors(node):
                if child in descendants[node]:  #if the child is already an indirect descendant, delete the edge
                    G.remove_edge(node,child)
                else:                                    #otherwise add it to the descendants
                    descendants[node].add(child)
            for predecessor in G.predecessors(node):             #update all parents' indirect descendants
                descendants[predecessor].update(descendants[node])  
                processed_child_count[predecessor]+=1            #we have processed one more child of this parent
                if processed_child_count[predecessor] == out_degree[predecessor]:  #if all children processed, add to list for next iteration.
                    next_nodes.append(predecessor)
        nodes_to_process=next_nodes

要测试它:

^{pr2}$

这将是有效的:它必须在开始时处理每个节点,以确定它是否是“结束”节点。然后它处理到达这些边的每个边,并检查是否已处理该父对象的所有子对象。然后它看着那些父母,然后重复。在

因此,它将处理每条边一次(包括浏览父对象),每个顶点将在开始时处理一次,然后处理一次。在

相关问题 更多 >

    热门问题