Building shortest paths and relation lists from a CSV file in Python

Posted 2024-06-26 10:51:18


I need your help. I have written some incomplete code to tackle my problem.

I have this input file:

-----INPUT : test2.csv-----
child, parent, relation
M3,Q,P
M54,M7,P
M54,M27,E
Q,M7,P
M7,Q,E
M7,M3,P
M27,Q,E

======OUTPUT REQUIRED====
M3->Q,P
M54->M7->Q,PE
M7->Q,E
M27->Q,E

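======PROBLEM DESCRIPTION======

Q is the ultimate parent. I want to trace every child back to Q (the shortest path to the parent node Q). For example, for the first row the output should be: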

M3->Q,P

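But in the second row, M54 is a child of M7 with the relation marked 'P' (M54->M7,P), and we still need to traverse from M7 up to the ultimate parent, 'Q'. Searching the CSV file for M7's parents, rows 5 and 6 show that M7 can have either 'M3' or 'Q' as its parent. So there are two paths back to the ultimate parent Q:

M54->M7->Q,PE & M54->M7->M3->Q,PPP

But we only want the shortest path, i.e.

M54->M7->Q,PE

Another problem is that there are also cyclic paths. Consider rows 4 and 5, for example: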

Q,M7,P
M7,Q,E

So in this case we want the output to be M7->Q,E (rather than the Q->M7->Q one might expect).
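
The two requirements described above (keep only the shortest path, and never loop back through a node that was already visited) can be sketched as a breadth-first search from each child towards Q. This is only an illustrative sketch, not the code from the question or the answer: the names `SAMPLE`, `parents_from_rows` and `shortest_path_to_root` are made up here, and the sample data is embedded as a string instead of being read from test2.csv.

```python
import csv
import io
from collections import deque

# Assumption: the contents of test2.csv, embedded so the sketch runs on its own.
SAMPLE = """child, parent, relation
M3,Q,P
M54,M7,P
M54,M27,E
Q,M7,P
M7,Q,E
M7,M3,P
M27,Q,E
"""


def parents_from_rows(rows):
    """Map each child to its (parent, relation) pairs, in file order."""
    parents = {}
    for child, parent, relation in rows:
        parents.setdefault(child, []).append((parent, relation))
    return parents


def shortest_path_to_root(parents, start, root='Q'):
    """Breadth-first search from start towards root.

    BFS explores paths in order of length, so the first path that
    reaches root is a shortest one. The seen set prevents cycles.
    """
    queue = deque([([start], '')])
    seen = {start}
    while queue:
        path, rels = queue.popleft()
        for parent, relation in parents.get(path[-1], []):
            if parent == root:
                return path + [parent], rels + relation
            if parent not in seen:
                seen.add(parent)
                queue.append((path + [parent], rels + relation))
    return None, None


reader = csv.reader(io.StringIO(SAMPLE))
next(reader, None)  # skip the header row
rows = [[cell.strip() for cell in row] for row in reader if row]
parents = parents_from_rows(rows)

lines = []
for child in parents:
    if child == 'Q':
        continue  # Q is the root itself, so it has no path to report
    path, rels = shortest_path_to_root(parents, child)
    if path:
        lines.append('%s,%s' % ('->'.join(path), rels))

print('\n'.join(lines))
```

When two paths have the same length (M54 can reach Q through M7 or through M27), this sketch keeps whichever parent appears first in the file.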

This is the code I have come up with so far:

# Read the csv file
import csv
z=[]
with open('test2.csv', 'rb') as f:
    reader = csv.reader(f)
    for row in reader:
        z.append(row)        

# Now generate list-of-list to store relations. Eg [['M7', 'Q', ['P']],.....
for i in range(len(z)):
    t=z[i].pop()
    z[i].append([t])

# Now lets populate the list   
t=[]    
out=[]
for i in range(len(z)):
    child = z[i][0]
    if z[i][1]=='Q': #whos the parent ?
            out.append(z[i])
            continue
    t.append(z[i])
    index=t.index(z[i])
    print (index)
    parent=t[index][1]
    print (parent)
    length=len(t[index])
    for j in range(len(z)):
        if z[j][0] == child:
            if z[j][1]=='Q':
                t[index].insert(length,'Q')

#print (parent)
print ("temp=",t)
print ("out=",out)

1 answer

Answer 1 · posted 2024-06-26 10:51:18

Here is a solution based on this essay on Python.org:

from pprint import pformat
import csv


def graph_from_csv(filename):
    """Turn a CSV of child, parent, relation entries into a graph represented
    by a dictionary, for example:

    {'A': [('B', 'ab'), ('C', 'ac')],
     'B': [('C','bc'), ('D','bd')],
     'C': [('D','cd')],
     'D': [('C','dc')]}
    """
    rows = []
    with open(filename, 'r') as f:
        reader = csv.reader(f)

        # Skip header row
        next(reader, None)

        for row in reader:
            rows.append(row)


    graph = dict()
    for row in rows:
        child, parent, relation = row
        if parent not in graph:
            graph[parent] = []
        graph[parent].append((child, relation))

    return graph


def find_shortest_path(graph, start, end, path=[], relations=[]):
    path = path + [start]
    if start == end:
        return path, relations
    if start not in graph:
        return (None, None)
    shortest = None
    shortest_rels = None
    for node, relation in graph[start]:
        if node not in path:
            newpath, newrelations = find_shortest_path(graph, node, end, path,
                                                       relations + [relation])
            if newpath:
                if not shortest or len(newpath) < len(shortest):
                    shortest_rels = newrelations
                    shortest = newpath
    return shortest, shortest_rels


def format_output(path, relations):
    p = '->'.join(path)
    r = ''.join(relations)
    return "%s,%s" % (p, r)


def main():
    graph = graph_from_csv('test2.csv')
    print("Graph:")
    print(pformat(graph))
    print()

    start, end = 'Q', 'M54'
    path, relations = find_shortest_path(graph, start, end)
    print("Shortest path from '%s' to '%s'" % (start, end))
    print(format_output(path, relations))
    print()

    start, end = 'Q', 'M7'
    path, relations = find_shortest_path(graph, start, end)
    print("Shortest path from '%s' to '%s'" % (start, end))
    print(format_output(path, relations))
    print()


if __name__ == '__main__':
    main()

Output:

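Graph:
{'M27': [('M54', 'E')],
 'M3': [('M7', 'P')],
 'M7': [('M54', 'P'), ('Q', 'P')],
 'Q': [('M3', 'P'), ('M7', 'E'), ('M27', 'E')]}

Shortest path from 'Q' to 'M54'
Q->M7->M54,EP

Shortest path from 'Q' to 'M7'
Q->M7,E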

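You'll notice that, compared with your example, the direction of the path query and of the resulting paths is reversed (from ancestor to child). That is simply how the original find_shortest_path function in the essay models parent-child relationships in a directed graph, and I stuck with it because it seems the most natural way to express them.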

However, if you do want it the other way round, it is just a matter of reversing the respective lists (path and relations) and swapping start and end.
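
As a rough illustration of that reversal, the sketch below flips an ancestor-first result into the child-first form the question asks for. The helper name `to_child_first` is hypothetical, and the `path` and `relations` values are assumed to be what find_shortest_path returns for start 'Q' and end 'M54' on the sample data.

```python
def to_child_first(path, relations):
    """Hypothetical helper: flip an ancestor-first result to child-first."""
    child_first_path = list(reversed(path))
    child_first_rels = list(reversed(relations))
    return "%s,%s" % ('->'.join(child_first_path), ''.join(child_first_rels))


# Assumed result of find_shortest_path(graph, 'Q', 'M54') on the sample data.
path, relations = ['Q', 'M7', 'M54'], ['E', 'P']

formatted = to_child_first(path, relations)
print(formatted)
```

The relations have to be reversed together with the nodes so that each letter still lines up with the edge it describes.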


(1) I think this line deserves some explanation:

path = path + [start]

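Why not simply append to path? Because path=[] is a mutable default argument: if you appended to path, you would end up modifying the default argument for all subsequent calls to the function.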
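
To make the difference concrete, here is a small sketch with two made-up functions (`visit_appending` and `visit_copying` are not part of the solution above). The first mutates the shared default list, the second builds a new list the way find_shortest_path does.

```python
def visit_appending(node, path=[]):
    # Appends to the default list itself, which is created only once
    # when the function is defined and then shared by every call.
    path.append(node)
    return path


def visit_copying(node, path=[]):
    # Builds a new list, so the shared default list is never modified.
    path = path + [node]
    return path


# list(...) snapshots each result, because visit_appending keeps
# returning the very same list object.
first = list(visit_appending('Q'))
second = list(visit_appending('M7'))
print(first, second)

third = visit_copying('Q')
fourth = visit_copying('M7')
print(third, fourth)
```

The second call to `visit_appending` still sees the 'Q' left behind by the first call, while each call to `visit_copying` starts from an empty default.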
