用python列表代替dataframe

data = [ ["ANALYTICAL_BALANCE","BFG_DEPOSIT"], ["CUSTOMER_DETAIL","BALANCE"], ["BFG_2056", "FFD_15"], ["BALANCE","BFG_16"], ["BFG_16","STAT_HIST"], ["ANALYTICAL_BALANCE","BFG_2056"], ["CUSTOM_DATA","AND_11"], ["AND_11","DICT_DEAL"], ["DICT_DEAL","BFG_2056"] ]

def create_adj(edges): adj = {} # or use defaultdict(list) to avoid `if` in the loop below for a, b in edges: if not a in adj: adj[a] = [] if not b in adj: adj[b] = [] adj[a].append(b) return adj

def all_paths(adj): def recur(path): node = path[-1] neighbors = [neighbor for neighbor in adj[node] if not neighbor in path] if not neighbors: yield path for neighbor in neighbors: yield from recur(path + [neighbor]) for node in adj: yield from recur([node])

adj = create_adj(data) paths = all_paths(adj) for i in paths: if len(i) > 1: print(i) output: ['ANALYTICAL_BALANCE', 'BFG_DEPOSIT'] ['ANALYTICAL_BALANCE', 'BFG_2056', 'FFD_15'] ['CUSTOMER_DETAIL', 'BALANCE', 'BFG_16', 'STAT_HIST'] ['BALANCE', 'BFG_16', 'STAT_HIST'] ['BFG_2056', 'FFD_15'] ['BFG_16', 'STAT_HIST'] ['CUSTOM_DATA', 'AND_11', 'DICT_DEAL', 'BFG_2056', 'FFD_15'] ['AND_11', 'DICT_DEAL', 'BFG_2056', 'FFD_15'] ['DICT_DEAL', 'BFG_2056', 'FFD_15']

1条回答

网友

1楼 · 发布于 2024-05-06 16:59:13

使用熊猫不会加快您的速度，或者至少不会显著加快，因为熊猫数据帧旨在处理表格数据，并且没有针对数据的网络结构进行优化

使用networkx肯定会简化您的代码，但我担心它也不会显著提高您的速度，因为此库与您的代码一样，是由纯python构建的

通过使用其他fast库，例如turicreate中的igraph或SGraph对象，您将获得更好的性能。对于这两个库，一个需要习惯于使用，但这两个库肯定会为您加快速度

另一种加速的方法当然是找到一个适合你的问题的、更快的算法

现在，为了更实际，您实际要做的是生成图的所有可能的生成树。也许here你可以找到一些很好的参考，为一个合适的算法，将实现你想要的在较少的步骤

还发现this 其中包括一个答案，建议找到所有最小生成树的代码。您可以看到它如何创建networkx的Graph对象并使用函数。然后你可以用同样的权重搜索所有的边，它会在图中找到所有的生成树

相关问题更多 >

编程相关推荐

热门问题

热门文章