Compare all lists in a list of lists based on a condition and on their differences

Posted 2024-09-19 23:42:11

I have the following list:

a = [[1,2,3,4,5], [4,5,6,7,8], [1,2,3,4], [4,5,6,7,8,9], [2,3,4,5,6,7,8], [6,7,8,9], [5,6,7,8,9], [2,3,4,5,6], [3,4,5,6], [11,12,13,14,15], [13,14,15]]

For ease of understanding, here it is with indices:

0 [1, 2, 3, 4, 5]
1 [4, 5, 6, 7, 8]
2 [1, 2, 3, 4]
3 [4, 5, 6, 7, 8, 9]
4 [2, 3, 4, 5, 6, 7, 8]
5 [6, 7, 8, 9]
6 [5, 6, 7, 8, 9]
7 [2, 3, 4, 5, 6]
8 [3, 4, 5, 6]
9 [11, 12, 13, 14, 15]
10 [13, 14, 15]

I want the output to be a list of tuples, like this:

output = [(0,2,1), (3,1,1), (4,7,2), (4,1,2), (6,5,1), (3,5,2), (3,6,1), (7,8,1), (9,10,2)]

For example, to explain the first item of the output, i.e. (0,2,1):

0 ---> index of the longer of the two lists being compared
2 ---> index of the shorter of the two lists being compared
1 ---> difference in length between lists 0 and 2

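Now, on to the problem:

I have a number of lists, some of which are similar to others except that they differ in length by one or two (or three) items at the beginning or the end.

I want to sort and group them, and identify the indices of the matching lists and their length difference as tuples.

I looked through several Stack Overflow questions but could not find a similar one.

I am new to Python and started with the following bits of code, but got stuck: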

from itertools import groupby

a = sorted(a, key=len)

incr = [list(g) for k, g in groupby(a, key=len)]

decr = list(reversed(incr))

ndecr = [i for j in decr for i in j]

for i in range(len(ndecr)-1):
    if len(ndecr[i]) - len(ndecr[i+1]) == 1:
        print(ndecr[i])

for i in range(len(ndecr)-2):
    if len(ndecr[i]) - len(ndecr[i+2]) == 2:
        print(ndecr[i])

for i in ndecr:
    ele = i
    ndecr.remove(i)
    for j in ndecr:
        if ele[:-1] == j:
            print(j)   

for i in ndecr:
    ele = i
    ndecr.remove(i)
    for j in ndecr:
        if ele[:-2] == j:
            print(i)

Please help me with what approach to take to get this result.

3 Answers

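Edit (original below):

Now I think I understand you better (thanks to the clarifying comment from @vash_the_stampede). This approach nests a couple of loops to compare every list in the list of lists with every other one and determine whether one is a subset of the other. Whenever two compared lists are in a superset/subset relationship, it appends a tuple to the output list containing the indices of the two lists, with the longer list's index first, followed by their difference in length.

Important: this approach does not compare the order of the elements, so it can produce output you may not want. For example, [1,2,4,5] counts as a subset of [1,2,3,4,5] with a length difference of 1. Specifically for your example, it outputs one extra tuple compared with your sample output, because [3,4,5,6] at index 8 is a subset of [2,3,4,5,6,7,8] at index 4, with a length difference of 3. I think the answer from @DSM handles this, so it is probably closer to what you need.

Example with the current dataset: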

a = [[1,2,3,4,5], [4,5,6,7,8], [1,2,3,4], [4,5,6,7,8,9], [2,3,4,5,6,7,8], [6,7,8,9], [5,6,7,8,9], [2,3,4,5,6], [3,4,5,6], [11,12,13,14,15], [13,14,15]]

output = []
for i in range(len(a)):
    for j in range(i + 1, len(a)):
        # keep the pair only if one list's elements are all contained in the other
        if set(a[i]).issubset(a[j]) or set(a[i]).issuperset(a[j]):
            diff = abs(len(a[i]) - len(a[j]))
            # put the index of the longer list first
            if len(a[i]) > len(a[j]):
                output.append((i, j, diff))
            else:
                output.append((j, i, diff))

print(output)

# OUTPUT
# [(0, 2, 1), (3, 1, 1), (4, 1, 2), (3, 5, 2), (3, 6, 1), (4, 7, 2), (4, 8, 3), (6, 5, 1), (7, 8, 1), (9, 10, 2)]
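
To make the caveat above concrete, here is a small sketch contrasting the set-based test with a stricter "starts with or ends with" test. The helper name `is_prefix_or_suffix` and the sample variables are illustrative assumptions, not part of the original answer.

```python
def is_prefix_or_suffix(short, long):
    """Return True if `short` sits at the very start or very end of `long`."""
    n = len(short)
    if n == 0 or n > len(long):
        return False
    return long[:n] == short or long[-n:] == short


full = [1, 2, 3, 4, 5]
gappy = [1, 2, 4, 5]                 # skips 3, so not a contiguous run of `full`

idx4 = [2, 3, 4, 5, 6, 7, 8]         # list at index 4 in the question
idx8 = [3, 4, 5, 6]                  # list at index 8: inside idx4, but not at either end

# The set test accepts both pairs because it ignores order and position.
set_says = [set(gappy).issubset(full), set(idx8).issubset(idx4)]

# The prefix/suffix test rejects both pairs.
strict_says = [is_prefix_or_suffix(gappy, full), is_prefix_or_suffix(idx8, idx4)]

print(set_says)     # [True, True]
print(strict_says)  # [False, False]
```

This is why the set-based version reports (4, 8, 3) while the sample output in the question does not.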

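Original:

If I understand correctly, you can nest a couple of loops to compare every list in the list of lists with every other one. Then build an output list of tuples, each containing the indices of the two compared lists and their difference in length. For example: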

a = [[1,2,3,4,5], [4,5,6,7,8], [1,2,3,4], [4,5,6,7,8,9], [2,3,4,5,6,7,8], [6,7,8,9], [5,6,7,8,9], [2,3,4,5,6], [3,4,5,6], [11,12,13,14,15], [13,14,15]]

output = []
for i in range(len(a)):
    for j in range(i + 1, len(a)):
        diff = abs(len(a[i]) - len(a[j]))
        output.append((i, j, diff))

print(output)

# OUTPUT
# [(0, 1, 0), (0, 2, 1), (0, 3, 1), (0, 4, 2), (0, 5, 1), (0, 6, 0), (0, 7, 0), (0, 8, 1), (0, 9, 0), (0, 10, 2), (1, 2, 1), (1, 3, 1), (1, 4, 2), (1, 5, 1), (1, 6, 0), (1, 7, 0), (1, 8, 1), (1, 9, 0), (1, 10, 2), (2, 3, 2), (2, 4, 3), (2, 5, 0), (2, 6, 1), (2, 7, 1), (2, 8, 0), (2, 9, 1), (2, 10, 1), (3, 4, 1), (3, 5, 2), (3, 6, 1), (3, 7, 1), (3, 8, 2), (3, 9, 1), (3, 10, 3), (4, 5, 3), (4, 6, 2), (4, 7, 2), (4, 8, 3), (4, 9, 2), (4, 10, 4), (5, 6, 1), (5, 7, 1), (5, 8, 0), (5, 9, 1), (5, 10, 1), (6, 7, 0), (6, 8, 1), (6, 9, 0), (6, 10, 2), (7, 8, 1), (7, 9, 0), (7, 10, 2), (8, 9, 1), (8, 10, 1), (9, 10, 2)]

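Well, I am sure this can be done more efficiently, but what I did was make a copy of the original list, trim one or two items from either end of each list in it, compare the trimmed versions against the lists in the original, and record the matching indices along with the length difference. It works, but it is fairly bulky; I will look at reducing it.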

l1 = a[:]

tups = []
for idx, item in enumerate(l1):
    for x, i in enumerate(a):
        if sorted(item[:-1]) == sorted(i):
            tups.append((idx, x, 1))
        elif sorted(item[:-2]) == sorted(i):
            tups.append((idx, x, 2))
        elif sorted(item[1:]) == sorted(i):
            tups.append((idx, x, 1))
        elif sorted(item[2:]) == sorted(i):
            tups.append((idx, x, 2))

print(tups)

# OUTPUT
# [(0, 2, 1), (3, 1, 1), (3, 5, 2), (3, 6, 1), (4, 1, 2), (4, 7, 2), (6, 5, 1), (7, 8, 1), (9, 10, 2)]
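
One possible way to shrink the four near-identical branches above is to loop over the trim size. This is only a sketch under a few assumptions: the function name `trimmed_matches` and the `max_trim` parameter are made up for illustration, and it compares the slices in their existing order instead of calling `sorted()` on them, which gives the same result for the question's data.

```python
def trimmed_matches(lists, max_trim=2):
    """Pair each list with any list equal to it minus 1..max_trim end items."""
    matches = []
    for idx, item in enumerate(lists):
        for x, other in enumerate(lists):
            for k in range(1, max_trim + 1):
                # Skip trims that would leave nothing to compare.
                if len(item) <= k:
                    break
                # item[:-k] drops k items from the end, item[k:] from the start.
                if other == item[:-k] or other == item[k:]:
                    matches.append((idx, x, k))
                    break
    return matches


a = [[1,2,3,4,5], [4,5,6,7,8], [1,2,3,4], [4,5,6,7,8,9], [2,3,4,5,6,7,8],
     [6,7,8,9], [5,6,7,8,9], [2,3,4,5,6], [3,4,5,6], [11,12,13,14,15], [13,14,15]]

tups = trimmed_matches(a)
print(tups)
```

Passing `max_trim=3` would also cover the "or three" case mentioned in the question.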

IIUC, and assuming the total number of lists is small enough that len(lists)^2 is still small, something like

from itertools import combinations

# sort by length but preserve the index
ax = sorted(enumerate(a), key=lambda x: len(x[1]))

done = []

for (i0, seq0), (i1, seq1) in combinations(ax, 2):
    # seq0 is never longer than seq1 because ax is sorted by length
    if seq1[:len(seq0)] == seq0 or seq1[-len(seq0):] == seq0:
        done.append((i1, i0, len(seq1)-len(seq0)))

gives me

In [117]: sorted(done)
Out[117]: 
[(0, 2, 1),
 (3, 1, 1),
 (3, 5, 2),
 (3, 6, 1),
 (4, 1, 2),
 (4, 7, 2),
 (6, 5, 1),
 (7, 8, 1),
 (9, 10, 2)]

This matches your output, though in a different order, and you had listed (4,7,2) twice.

seq1[:len(seq0)] == seq0 

is the "does seq1 start with seq0?" condition, and

seq1[-len(seq0):] == seq0

is the "does seq1 end with seq0?" condition.
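
The two conditions can also be wrapped in a reusable function. The following is a sketch, not a definitive implementation: the name `prefix_suffix_pairs` and the optional `max_diff` cap are assumptions added for illustration, and unlike the loop above it skips empty lists and pairs of equal length.

```python
from itertools import combinations


def prefix_suffix_pairs(lists, max_diff=None):
    """Return sorted (longer_idx, shorter_idx, length_diff) tuples."""
    # Sort by length but keep each list's original index.
    indexed = sorted(enumerate(lists), key=lambda pair: len(pair[1]))
    result = []
    for (i0, seq0), (i1, seq1) in combinations(indexed, 2):
        diff = len(seq1) - len(seq0)
        # Assumption: equal-length pairs and empty lists are not interesting.
        if diff == 0 or not seq0:
            continue
        if max_diff is not None and diff > max_diff:
            continue
        starts_with = seq1[:len(seq0)] == seq0
        ends_with = seq1[-len(seq0):] == seq0
        if starts_with or ends_with:
            result.append((i1, i0, diff))
    return sorted(result)


a = [[1,2,3,4,5], [4,5,6,7,8], [1,2,3,4], [4,5,6,7,8,9], [2,3,4,5,6,7,8],
     [6,7,8,9], [5,6,7,8,9], [2,3,4,5,6], [3,4,5,6], [11,12,13,14,15], [13,14,15]]

pairs = prefix_suffix_pairs(a)
close_pairs = prefix_suffix_pairs(a, max_diff=1)
print(pairs)
print(close_pairs)
```

With `max_diff=1` only pairs whose lengths differ by exactly one item are kept, which may be handy if the allowed difference should be limited to one, two, or three.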