比较两部词典的效率

import collections def uniq(input): output = [] for x in input: if x not in output: output.append(x) return output Su = {} with open ('Sucrose_rivacombined.txt') as f: for line in f: (key, val) = line.split('\t') Su[(key)] = val Su_OD = collections.OrderedDict(Su) Su_keys = Su_OD.keys() Et = {} with open ('Ethanol_rivacombined.txt') as g: for line in g: (key, val) = line.split('\t') Et[(key)] = val Et_OD = collections.OrderedDict(Et) Et_keys = Et_OD.keys() merged_keys = Su_keys + Et_keys merged_keys = uniq(merged_keys) d3=collections.OrderedDict() output_doc = open("compare.txt","w+") for chr_local in merged_keys: line_output = chr_local if (Et.has_key(chr_local)): line_output = line_output + "\t" + Et[chr_local] else: line_output = line_output + "\t" + "ND" if (Su.has_key(chr_local)): line_output = line_output + "\t" + Su[chr_local] else: line_output = line_output + "\t" + "ND" output_doc.write(line_output + "\n")

3条回答

网友

1楼 · 编辑于 2024-09-28 18:59:34

你不需要你独特的功能。

伪代码如下：

按顺序读取文件2
进程文件1写出它的项目（已正确订购）
pop，其中defalut来自文件2，用于输出行的最后一部分
在使用文件1后，处理来自文件2的有序dict

此外，爱的清单理解…你可以阅读文件与：

OrderedDict(line.strip().split('\t') for line in open('Ethanol_rivacombined.txt'))

只有一个点的是dict和蔗糖_竞争组合.txt“甚至永远也记不起来。应该是超快的

编辑完成代码（不确定输出行格式）

^{pr2}$

网友

2楼 · 编辑于 2024-09-28 18:59:34

将uniq替换为以下内容，因为输入是可散列的：

def uniq(input):
  output = []
  s = set()
  for x in input:
    if x not in s:
      output.append(x)
      s.add(x)
  return output

这将把一个接近O(n^2)的进程减少到接近O(n)。在

网友

3楼 · 编辑于 2024-09-28 18:59:34

您的output是一个列表，但是您的输入是字典：它们的键保证是唯一的，但是您的not in output需要与列表中的每个元素进行比较，这是组合的。（由于not检查，您正在进行n^2个比较。）

您可能完全可以将uniq替换为：

Su_OD.update(Et_OD)

这对我有用：

^{pr2}$

输出：

    hello one
    world two
    cats two

相关问题更多 >

编程相关推荐

热门问题

热门文章