How to insert multiple values under a single unique key in Python

Published 2024-10-04 09:27:34


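I have a df. For each value in the first column, I want to collect the values from the second column into a list, and build a mapping whose keys are the unique IDs from the first column and whose values are the lists of second-column values.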

Here is an example:

Bra100001       Bra100001
Bra100001       Bra011864
Bra100002       Bra011842
Bra100002       Bra100002
Bra100003       Bra100003
Bra100004       Bra100004
Bra100005       Bra100005
Bra100006       Bra100006
Bra100007       Bra011656
Bra100008       Bra100007
Bra100009       Bra100008
Bra100009       Bra011638
Bra100010       Bra103178
Bra100010       Bra011635

The output I need is

Bra100001:(Bra100001,Bra011864),Bra100002:(Bra011842,Bra100002),Bra100003:Bra100003 and so on....

Here is my code

with open("test_blast.txt", 'r') as fh_in:
        prev = None
        result = {}
        for line in fh_in:
            line = line.strip()
            line = line.split()
            if prev == line[0]:
                result[line[0]] = line[1]
            prev = line[0]

3 answers

If you want the output to be an ordered dictionary of tuples:

import collections

# OrderedDict keeps keys in the order they are first seen
result = collections.OrderedDict()

with open("test_blast.txt", 'r') as fh_in:
    for line in fh_in:
        col = line.split()
        if len(col) < 2:
            continue  # skip blank or malformed lines
        if col[0] not in result:
            result[col[0]] = ()
        result[col[0]] += (col[1],)
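
The same grouping can also be sketched with `collections.defaultdict`, which creates the empty list on first access so the explicit membership check is not needed. This is only a minimal sketch: the helper name `group_pairs` and the in-memory `sample_lines` list are assumptions made for illustration (the rows are copied from the question), and the code relies on plain dicts preserving insertion order, which holds on Python 3.7 and later.

```python
from collections import defaultdict

# Hypothetical in-memory stand-in for the lines of test_blast.txt;
# iterating over an open file handle would work the same way.
sample_lines = [
    "Bra100001       Bra100001",
    "Bra100001       Bra011864",
    "Bra100002       Bra011842",
    "Bra100002       Bra100002",
    "Bra100003       Bra100003",
    "",
]


def group_pairs(lines):
    """Collect the second-column values under each first-column key."""
    grouped = defaultdict(list)
    for line in lines:
        cols = line.split()
        if len(cols) < 2:
            continue  # skip blank or malformed lines
        grouped[cols[0]].append(cols[1])
    # Convert to a plain dict of tuples, matching the shape used above
    return {key: tuple(values) for key, values in grouped.items()}


result = group_pairs(sample_lines)
print(result)
```

Appending to a list and converting once at the end avoids building a new tuple for every row, which is what `+=` on a tuple does.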

If you want the output as a string in the exact format given in the example, you can post-process the result like this:

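out = []
for r in result:
    # str() of a tuple looks like "('a', 'b')"; strip the spaces and quotes
    s = str(result[r]).replace(', ', ',').replace("'", '')
    if s.endswith(",)"):
        s = s[1:-2]  # single-element tuple: drop "(" and ",)"
    out.append(r + ':' + s)

print(",".join(out))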

You can see a demo here: http://repl.it/46l/3
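
As an alternative to editing the `str()` form of each tuple, the string can be assembled directly with `join`. The following is a minimal sketch, not the answer's original method: the function name `format_groups` and the small `example` mapping are hypothetical, and it assumes every value is a non-empty tuple of strings.

```python
def format_groups(groups):
    """Render {key: (v1, v2, ...)} as key:(v1,v2),key:v style text."""
    parts = []
    for key, values in groups.items():
        if len(values) == 1:
            body = values[0]  # a single value is written without parentheses
        else:
            body = "(" + ",".join(values) + ")"
        parts.append(key + ":" + body)
    return ",".join(parts)


# Hypothetical sample mapping using IDs from the question
example = {
    "Bra100001": ("Bra100001", "Bra011864"),
    "Bra100003": ("Bra100003",),
}
formatted = format_groups(example)
print(formatted)
```

Building the text from the values themselves avoids surprises if a value ever contains a quote character or a comma followed by a space.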

There are a few options: just a plain tuple or list, x = ('x1', 'x2'), or a hash with buckets:

foobar = {
    'key1': [some_vals],
    'key2': [some_other_vals],
    # ...
}

Then just refer to foobar.
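
One way the "hash with buckets" idea might be filled in is with `dict.setdefault`, which returns the existing list for a key or stores and returns a new empty one. This is a small sketch under assumptions: the `pairs` data is made up for illustration and is not from the question.

```python
# Hypothetical (key, value) pairs standing in for parsed input rows
pairs = [("key1", "a"), ("key1", "b"), ("key2", "c")]

foobar = {}
for key, value in pairs:
    # setdefault creates the bucket on first use, then we append to it
    foobar.setdefault(key, []).append(value)

print(foobar["key1"])
print(foobar["key2"])
```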

with open('test_blast.txt', 'r') as f:
    lines = f.readlines()

records = [l.split() for l in lines]
records = [r for r in records if len(r) == 2]  # drop empty line at end in my tests

result = {}
for r in records:
    if r[0] not in result:
        result[r[0]] = []    # this is the first reference to key so initialize value
    result[r[0]].append(r[1])

# below is only needed for sorted output
keys = sorted(result.keys())
for k in keys:
    print(k, ': ', result[k])
