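<p>This generates a list of node chains, one per path. It relies on plain Python lists and a dict for the grouping (hashing), so it will be quite slow on large datasets. If that is your situation, take a look at the pandas library (groupby). The same logic still applies, but you will definitely save time in the nodes_by_date function. That library may also have other tools for generating the paths you need.</p>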
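<p>As a rough idea of what the groupby suggestion could look like, here is a minimal sketch. It assumes pandas is installed; the function name <code>nodes_by_date_pandas</code> and the column names <code>name</code>, <code>date</code>, <code>value</code> are made up for illustration and are not part of the original code. It returns the same list-of-lists shape as the plain-Python <code>nodes_by_date</code> in the code that follows.</p>

```python
import pandas as pd

def nodes_by_date_pandas(nodes):
    # Column names are illustrative; each tuple is read as (name, date, value).
    df = pd.DataFrame(nodes, columns=["name", "date", "value"])
    # groupby sorts the group keys by default, so groups come out in date order,
    # and rows inside a group keep their original relative order.
    return [
        list(group.itertuples(index=False, name=None))
        for _, group in df.groupby("date")
    ]

nodes = [('E', 15, 3.65), ('A', 1, 2.09), ('B', 5, .89), ('C', 8, 3.17), ('D', 8, 1.15)]
print(nodes_by_date_pandas(nodes))
```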
<pre><code>nodes = [('E',15,3.65), ('A',1,2.09), ('B',5,.89), ('C',8,3.17), ('D',8,1.15)]
#nodes = [('E',15,3.65), ('A',1,2.09), ('B',5,.89), ('C',8,3.17), ('D',8,1.15), ('F', 16, 100), ('G', 16, 200), ('H', 17, 1000)]
def all_paths(nodes):
    nbd = nodes_by_date(nodes)
    return get_chains(nbd, 0)

def nodes_by_date(nodes):
    # sort by date (not sure what format your data is in, but it might be easier to convert to a datenum/UTC time)
    nodes = sorted(nodes, key=lambda x: x[1])
    # build a list of lists, each containing all of the nodes with a certain date
    # if your data is larger, look into the pandas library's groupby operation
    dates = {}
    for n in nodes:
        d = dates.get(n[1], [])
        d.append(n)
        dates[n[1]] = d
    return list(dates.values())

def get_chains(nbd, i):
    if i == len(nbd):
        return []
    # depth-first recursion so tails are only generated once
    tails = get_chains(nbd, i+1)
    chains = []
    for n in nbd[i]:
        if len(tails):
            for t in tails:
                newchain = [n]
                # only performant for smaller data
                newchain.extend(t)
                chains.append(newchain)
        else:
            # end of recursion
            chains.append([n])
    return chains
</code></pre>
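<p>Since every chain picks exactly one node from each date group, the result is the Cartesian product of the groups. Here is a minimal standard-library sketch of that idea using <code>itertools.groupby</code> and <code>itertools.product</code>. This is an alternative formulation, not the pandas approach mentioned above, and the name <code>all_paths_product</code> is hypothetical. For the sample data it should yield the same chains, in the same order, as <code>all_paths</code>.</p>

```python
from itertools import groupby, product

def all_paths_product(nodes):
    # Sort by date (index 1) so groupby sees equal dates next to each other.
    ordered = sorted(nodes, key=lambda n: n[1])
    # One list of nodes per distinct date, in ascending date order.
    groups = [list(g) for _, g in groupby(ordered, key=lambda n: n[1])]
    if not groups:
        # Match the recursive version, which returns no chains for empty input.
        return []
    # Each chain takes exactly one node from every date group.
    return [list(chain) for chain in product(*groups)]

nodes = [('E', 15, 3.65), ('A', 1, 2.09), ('B', 5, .89), ('C', 8, 3.17), ('D', 8, 1.15)]
for chain in all_paths_product(nodes):
    print(chain)
```

<p>Note that the number of chains is the product of the group sizes, so it grows quickly when many nodes share a date, whichever implementation you use.</p>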