Reading the contents of an input file into a dictionary keyed by an id variable

Posted 2024-09-28 05:22:33


I have a file, sampleLabs1.txt, that looks like this (it has many records, so I list only 5 rows):

visitid cdate ctime pqno test result unit range
OMHioJh8XEeq7152 6/15/2007 06:00 1181913408344759 CREAT 0.8 mg/dL 0.5-1.4
OMHioJh8XEeq7152 6/14/2007 07:10 1181827489130119 CREAT 0.8 mg/dL 0.5-1.4
OMHioJh8XEeq7152 6/11/2007 14:21 1181592540465036 CREAT 2.9 mg/dL 0.5-1.4
t2v0TjgroLTI6118 4/28/2006 14:18 114625776752828282 CREAT 8.7 mg/dL 0.5-1.4
t2v0TjgroLTI6118 5/1/2006 04:00 1146487572667772 CREAT 8.0 mg/dL 0.5-1.4

I want to read the contents of the input file into a dictionary keyed by "visitid". That is, I want something like this:

{OMHioJh8XEeq7152: 6/15/2007,06:00,1181913408344759,CREAT,0.8,mg/dL,0.5-1.4,
 OMHioJh8XEeq7152: 6/14/2007,07:10,1181827489130119,CREAT,0.8,mg/dL,0.5-1.4,
 OMHioJh8XEeq7152: 6/11/2007,14:21,1181592540465036,CREAT,2.9,mg/dL,0.5-1.4,
 t2v0TjgroLTI6118: 4/28/2006,14:18,114625776752828282,CREAT,8.7,mg/dL,0.5-1.4,
 t2v0TjgroLTI6118: 5/1/2006,04:00,1146487572667772,CREAT,8.0,mg/dL,0.5-1.4}

I wrote the following program:

import os
newdict = {}
with open(os.path.join("..","c:\work\python programming","sampleLabs1.txt"),"rU") as f:
    for line in f:
        splitLine = line.split()
        newdict[(splitLine[0])] = ",".join(splitLine[1:])
newdict

It does give me a dictionary, but it seems to overwrite the previous record for each "visitid" key and keeps only one record per unique key. That is, I get something like this:

{OMHioJh8XEeq7152: 6/15/2007,06:00,1181913408344759,CREAT,0.8,mg/dL,0.5-1.4,
 t2v0TjgroLTI6118: 5/1/2006,04:00,1146487572667772,CREAT,8.0,mg/dL,0.5-1.4}

But I want to keep all the records for each "visitid", for example:

{OMHioJh8XEeq7152: 6/15/2007,06:00,1181913408344759,CREAT,0.8,mg/dL,0.5-1.4,
 OMHioJh8XEeq7152: 6/14/2007,07:10,1181827489130119,CREAT,0.8,mg/dL,0.5-1.4,
 OMHioJh8XEeq7152: 6/11/2007,14:21,1181592540465036,CREAT,2.9,mg/dL,0.5-1.4,
 t2v0TjgroLTI6118: 4/28/2006,14:18,114625776752828282,CREAT,8.7,mg/dL,0.5-1.4,
 t2v0TjgroLTI6118: 5/1/2006,04:00,1146487572667772,CREAT,8.0,mg/dL,0.5-1.4}

Could someone help me modify the code? Thanks in advance for your help.


2 Answers
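from collections import defaultdict, namedtuple
import os

# Raw string so the backslashes are not treated as escape sequences
WORKDIR = r"c:\work\python programming"

# One record per line of the file; field names follow the header row
Datum = namedtuple('Datum', ['visitid', 'cdate', 'ctime', 'pqno', 'test', 'result', 'unit', 'range'])

def load_data(fname):
    """Return a dict mapping each visitid to the list of its records."""
    fname = os.path.join(WORKDIR, fname)
    newdict = defaultdict(list)
    # Plain 'r' already gives universal newlines on Python 3; the 'U' flag was removed in 3.11
    with open(fname, 'r') as inf:
        for line in inf:
            fields = line.split()
            if not fields:
                continue  # skip blank lines
            # Note: a header row, if present, ends up under the key 'visitid'
            d = Datum(*fields)
            newdict[d.visitid].append(d)
    return newdict

def main():
    data = load_data('sampleLabs1.txt')
    # now do something with it

if __name__=="__main__":
    main()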
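
As a rough illustration of what the grouped result looks like, here is a minimal sketch that applies the same defaultdict(list) idea to the five sample rows from the question. The rows are pasted inline through io.StringIO so the snippet runs without the file, and `group_by_visit` is a hypothetical helper name, not part of the answer above. It also assumes the first line is a header and skips it.

```python
import io
from collections import defaultdict, namedtuple

Datum = namedtuple('Datum', ['visitid', 'cdate', 'ctime', 'pqno',
                             'test', 'result', 'unit', 'range'])

# Sample rows copied from the question (assumed whitespace-separated)
SAMPLE = """\
visitid cdate ctime pqno test result unit range
OMHioJh8XEeq7152 6/15/2007 06:00 1181913408344759 CREAT 0.8 mg/dL 0.5-1.4
OMHioJh8XEeq7152 6/14/2007 07:10 1181827489130119 CREAT 0.8 mg/dL 0.5-1.4
OMHioJh8XEeq7152 6/11/2007 14:21 1181592540465036 CREAT 2.9 mg/dL 0.5-1.4
t2v0TjgroLTI6118 4/28/2006 14:18 114625776752828282 CREAT 8.7 mg/dL 0.5-1.4
t2v0TjgroLTI6118 5/1/2006 04:00 1146487572667772 CREAT 8.0 mg/dL 0.5-1.4
"""

def group_by_visit(lines):
    """Hypothetical helper: map each visitid to a list of Datum records."""
    grouped = defaultdict(list)
    rows = iter(lines)
    next(rows, None)  # assumption: the first line is a header, so skip it
    for line in rows:
        fields = line.split()
        if len(fields) != len(Datum._fields):
            continue  # ignore blank or malformed lines
        record = Datum(*fields)
        grouped[record.visitid].append(record)
    return grouped

records = group_by_visit(io.StringIO(SAMPLE))
for visitid, rows in records.items():
    # Every record for a visitid is kept, in file order
    print(visitid, [r.cdate for r in rows])
```

Each key now maps to a list, so a second record for the same visitid is appended instead of replacing the first.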

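If your plan is to analyze all the entries under a visitid, or to compare averages between visitids, and so on, you may want to treat this as a database table. The pandas package is well suited to that: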

import pandas
nd = pandas.read_csv('sampleLabs1.txt', sep=' ')
nd['visitid'].unique()  # all visitid values
nd[nd['visitid'] == 'OMHioJh8XEeq7152']['cdate']  # all cdates for a given visitid

To use a dictionary, you would need to make the value for each visitid a tuple of some sort, as in Hugh Bothwell's example.
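
For comparison, a plain dict can hold the same structure without any extra imports for the grouping step. The sketch below is only one way to do it, assuming whitespace-separated rows in the column order shown in the question. The names `by_visit` and `mean_result` are made up for illustration, and the per-visit average is included only to show the kind of analysis mentioned above.

```python
from statistics import mean

# Three rows copied from the question; columns after visitid are
# cdate, ctime, pqno, test, result, unit, range
lines = [
    "OMHioJh8XEeq7152 6/15/2007 06:00 1181913408344759 CREAT 0.8 mg/dL 0.5-1.4",
    "OMHioJh8XEeq7152 6/11/2007 14:21 1181592540465036 CREAT 2.9 mg/dL 0.5-1.4",
    "t2v0TjgroLTI6118 5/1/2006 04:00 1146487572667772 CREAT 8.0 mg/dL 0.5-1.4",
]

by_visit = {}
for line in lines:
    visitid, *rest = line.split()
    # setdefault creates the list the first time a visitid is seen,
    # so later rows are appended rather than overwriting earlier ones
    by_visit.setdefault(visitid, []).append(tuple(rest))

# Index 4 of each tuple is the 'result' column (assumed to be numeric)
mean_result = {
    visitid: mean(float(rec[4]) for rec in recs)
    for visitid, recs in by_visit.items()
}
print(mean_result)
```

Plain tuples keep the code short, at the cost of positional indexing. A namedtuple, as in the first answer, makes the field access self-describing.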
