Algorithm to compare two lists in Python and get the common elements

Posted 2024-09-30 08:32:38


I have two lists that share some common elements:

p = [('link1/d/b/c', 'target1/d/b/c'), ('link2/a/g/c', 'target2/a/g/c'), ..., ('linkn/b/b/f', 'targetn/b/b/f')]

q = [['target1/d/b/c', 'target1', 123, 334], ['targetn/b/b/f', 'targetn', 23, 64], ... ,['targetx/f/f/f', 'targetx', 999, 888]]

I am trying to compare them, find the entries whose targets match, and then do some work with the result.

At the moment I am using a simple and very slow algorithm:

for item in p:
   link = item[0]
   target = item[1]
   for item2 in q:
       target2 = item2[0]
       if target2 == target:
           do_some_job(...)

I think I need to compare the two lists and then build a single list that contains all the matching elements, for example:

pq = [['target1/d/b/c', 'target1', 123, 334, 'link1/d/b/c'], ..., ['targetn/b/b/f', 'targetn', 23, 64, 'linkn/b/b/f']]

and then call do_some_job(pq) once, instead of calling it every time a matching element is found.

How can I get this?

Regards


3 Answers

Use chain() to flatten both lists, then use set() and intersection() to get the common elements.

In [78]: from itertools import chain

In [79]: p
Out[79]: 
[('link1/d/b/c', 'target1/d/b/c'),
 ('link2/a/g/c', 'target2/a/g/c'),
 ('linkn/b/b/f', 'targetn/b/b/f')]

In [80]: q
Out[80]: 
[['target1/d/b/c', 'target1', 123, 334],
 ['targetn/b/b/f', 'targetn', 23, 64],
 ['targetx/f/f/f', 'targetx', 999, 888]]

In [81]: set(chain(*p)).intersection(set(chain(*q)))
Out[81]: set(['target1/d/b/c', 'targetn/b/b/f'])
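The session above comes from Python 2, where a set prints as set([...]). Below is a minimal Python 3 sketch of the same idea, assuming every field in p and q is hashable. It swaps in chain.from_iterable for chain(*p), and the name common is only illustrative.

```python
from itertools import chain

p = [('link1/d/b/c', 'target1/d/b/c'),
     ('link2/a/g/c', 'target2/a/g/c'),
     ('linkn/b/b/f', 'targetn/b/b/f')]
q = [['target1/d/b/c', 'target1', 123, 334],
     ['targetn/b/b/f', 'targetn', 23, 64],
     ['targetx/f/f/f', 'targetx', 999, 888]]

# Flatten each list of rows into one stream of values, then intersect the sets.
common = set(chain.from_iterable(p)) & set(chain.from_iterable(q))

# Sets are unordered, so sort before printing to get a stable display.
print(sorted(common))
```

Note that this gives only the shared values, not the merged rows the question asks for, and it loses the original order.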

Or use a list comprehension with short-circuiting:

[j for i in p for j in i if j in (z for y in q for z in y)]

Or use any():

In [87]: [j for i in p for j in i if any (j==z for y in q for z in y)]
Out[87]: ['target1/d/b/c', 'targetn/b/b/f']

Timings

In [93]: %timeit set(chain(*p)).intersection(set(chain(*q)))
100000 loops, best of 3: 7.38 us per loop                     ##  winner

In [94]: %timeit [j for i in p for j in i if j in (z for y in q for z in y)]
10000 loops, best of 3: 24.9 us per loop

In [95]: %timeit [j for i in p for j in i if any (j==z for y in q for z in y)]
10000 loops, best of 3: 27.4 us per loop

In [97]: %timeit [x for x in chain(*p) if x in chain(*q)]
10000 loops, best of 3: 12.6 us per loop

A list comprehension with chain should work:

[x for x in chain(*p) if x in chain(*q)]
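The comprehension above rebuilds chain(*q) and scans it linearly for every x, so its cost grows with len(p) * len(q). A possible variant, sketched here on the assumption that all fields are hashable, builds the lookup set once and keeps the order of p. The names q_values and common_in_order are illustrative.

```python
from itertools import chain

p = [('link1/d/b/c', 'target1/d/b/c'),
     ('link2/a/g/c', 'target2/a/g/c'),
     ('linkn/b/b/f', 'targetn/b/b/f')]
q = [['target1/d/b/c', 'target1', 123, 334],
     ['targetn/b/b/f', 'targetn', 23, 64],
     ['targetx/f/f/f', 'targetx', 999, 888]]

# Build the lookup set once; each membership test is then O(1) on average.
q_values = set(chain.from_iterable(q))

# Walk p's values in order and keep those that also occur somewhere in q.
common_in_order = [x for x in chain.from_iterable(p) if x in q_values]
print(common_in_order)
```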

You should use a dictionary:

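target_to_link = dict((v, k) for (k, v) in p)  # map each target to its link
for item in q:
    if item[0] not in target_to_link:
        continue  # this target from q has no entry in p
    args = item + [target_to_link[item[0]]]
    do_some_job(*args)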

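The target_to_link dictionary gives the link that corresponds to a target. Just make sure that no target appears more than once in p (that is, that several links do not share the same target), because a later entry would overwrite an earlier one in the dictionary.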

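In the for loop, we simply build a temporary argument list, args, that combines your item (for example, ['target1/d/b/c', 'target1', 123, 334]) with the corresponding link, and we use the function(*args) syntax to unpack it into separate arguments.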
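To get the single pq list the question asks for, the same dictionary lookup can collect the merged rows instead of calling the job once per match. This is a sketch under the assumption that each target appears at most once in p; merge_rows is a hypothetical helper name, not something from the original answer.

```python
def merge_rows(p, q):
    """Return the rows of q whose target also appears in p, with the link appended."""
    # Map each target (second field of a p pair) to its link (first field).
    target_to_link = {target: link for link, target in p}
    # Keep only rows whose first field is a known target; order follows q.
    return [row + [target_to_link[row[0]]] for row in q if row[0] in target_to_link]


p = [('link1/d/b/c', 'target1/d/b/c'),
     ('link2/a/g/c', 'target2/a/g/c'),
     ('linkn/b/b/f', 'targetn/b/b/f')]
q = [['target1/d/b/c', 'target1', 123, 334],
     ['targetn/b/b/f', 'targetn', 23, 64],
     ['targetx/f/f/f', 'targetx', 999, 888]]

pq = merge_rows(p, q)
for row in pq:
    print(row)
```

Building the dictionary takes one pass over p and the merge takes one pass over q, in place of the nested loops from the question.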


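If you need to loop over p instead, you can build a similar dictionary, target_to_args, that maps each target (the first field of a q entry) to the remaining fields of that entry,

and then do something like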

for (link, target) in p:
    if target in target_to_args:  # skip targets of p that are missing from q
        args = [target] + target_to_args[target] + [link]
        do_some_job(*args)
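A runnable sketch of looping over p. It assumes target_to_args maps each target (the first field of a q row) to the remaining fields of that row; that mapping is inferred from how the loop above uses it and is not shown in the source. rows_in_p_order is a hypothetical name, and the rows are collected into a list rather than passed to do_some_job.

```python
def rows_in_p_order(p, q):
    """Merge matching rows, following the order of p instead of q."""
    # Assumed shape: target -> the rest of the q row, e.g. ['target1', 123, 334].
    target_to_args = {row[0]: row[1:] for row in q}
    merged = []
    for link, target in p:
        if target in target_to_args:  # skip targets that never occur in q
            merged.append([target] + target_to_args[target] + [link])
    return merged


p = [('link1/d/b/c', 'target1/d/b/c'),
     ('link2/a/g/c', 'target2/a/g/c'),
     ('linkn/b/b/f', 'targetn/b/b/f')]
q = [['target1/d/b/c', 'target1', 123, 334],
     ['targetn/b/b/f', 'targetn', 23, 64],
     ['targetx/f/f/f', 'targetx', 999, 888]]

merged = rows_in_p_order(p, q)
for row in merged:
    print(row)
```

With this direction of lookup, a target that occurs several times in p yields one merged row per occurrence, whereas duplicate targets in q would overwrite each other in the dictionary.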
