How do I split a 2D array into an array of unique values and a dictionary?

# Original data
fileListCode = [['Seq3.xls', 'B08524_057'],
                ['Seq3.xls', 'B08524_053'],
                ['Seq3.xls', 'B08524_054'],
                ['Seq98.xls', 'B25034_001'],
                ['Seq98.xls', 'D25034_002'],
                ['Seq98.xls', 'B25034_003']]

tmpFileList = []
tmpCodeList = []
arrayListDict = []

# Store the unique file names in a temporary list:
for i in range(len(fileListCode)):
    if fileListCode[i][0] not in tmpFileList:
        tmpFileList.append(fileListCode[i][0])

3 Answers

User

#1 · edited 2024-09-24 00:23:44

You are mixing up lists and dictionaries.

It makes more sense to do it this way:

file_list_code = [['Seq3.xls', 'B08524_052'],
                  ['Seq3.xls', 'B08524_053'],
                  ['Seq3.xls', 'B08524_054'],
                  ['Seq98.xls', 'B25034_001'],
                  ['Seq98.xls', 'B25034_002'],
                  ['Seq98.xls', 'B25034_003']]

# Map each file name to the list of codes that belong to it
file_codes = {}
for name, code in file_list_code:
    if name not in file_codes:
        file_codes[name] = []  # first time this file name is seen
    file_codes[name].append(code)

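This produces:

{'Seq3.xls': ['B08524_052', 'B08524_053', 'B08524_054'],
 'Seq98.xls': ['B25034_001', 'B25034_002', 'B25034_003']}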

This can be simplified further with defaultdict. For something this simple it is arguably overkill, but it is good to know about. Here is an example:

import collections

file_list_code = [['Seq3.xls', 'B08524_052'],
                  ['Seq3.xls', 'B08524_053'],
                  ['Seq3.xls', 'B08524_054'],
                  ['Seq98.xls', 'B25034_001'],
                  ['Seq98.xls', 'B25034_002'],
                  ['Seq98.xls', 'B25034_003']]

# defaultdict(list) creates an empty list the first time a missing key is accessed
file_codes = collections.defaultdict(list)
for name, code in file_list_code:
    file_codes[name].append(code)
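
If importing collections feels like too much for a one-off script, a similar effect can be sketched with the built-in dict.setdefault. This is only an illustration: the function name group_codes and the variable unique_files are hypothetical and do not appear in the question.

```python
def group_codes(pairs):
    """Group the second element of each pair under the first element."""
    grouped = {}
    for name, code in pairs:
        # setdefault returns the existing list for `name`,
        # or stores a new empty list and returns that one
        grouped.setdefault(name, []).append(code)
    return grouped


file_list_code = [['Seq3.xls', 'B08524_052'],
                  ['Seq3.xls', 'B08524_053'],
                  ['Seq3.xls', 'B08524_054'],
                  ['Seq98.xls', 'B25034_001'],
                  ['Seq98.xls', 'B25034_002'],
                  ['Seq98.xls', 'B25034_003']]

file_codes = group_codes(file_list_code)

# The dict keys double as the list of unique file names
# (in first-seen order on Python 3.7+, where dicts keep insertion order)
unique_files = list(file_codes)

print(unique_files)
print(file_codes['Seq98.xls'])
```

The keys of the resulting dict already serve as the unique file list, so a separate temporary list like tmpFileList is not needed.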

User

#2 · edited 2024-09-24 00:23:44

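fileListCode = [['Seq3.xls', 'B08524_052'],
                ['Seq3.xls', 'B08524_053'],
                ['Seq3.xls', 'B08524_054'],
                ['Seq98.xls', 'B25034_001'],
                ['Seq98.xls', 'B25034_002'],
                ['Seq98.xls', 'B25034_003']]

dico = {}  # file name -> index of its entry in li
li = []    # entries of the form [file name, {code: 1-based position}]
for a, b in fileListCode:
    if a in dico:
        # known file: number the code by its position within that file
        li[dico[a]][1][b] = len(li[dico[a]][1]) + 1
    else:
        # new file: remember where its entry lives, then create it
        dico[a] = len(li)
        li.append([a, {b: 1}])

print('\n'.join(map(str, li)))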
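
The snippet above comes without explanation, so here is a hedged sketch of the structure it appears to build: a list of [file name, dict] entries, where each dict maps a code to its 1-based position within that file. The function name split_by_file and the names index_of and result are hypothetical, used only for this illustration.

```python
def split_by_file(pairs):
    """Return a list of [file name, {code: 1-based position}] entries."""
    index_of = {}  # file name -> index of its entry in `result`
    result = []
    for name, code in pairs:
        if name in index_of:
            codes = result[index_of[name]][1]
            codes[code] = len(codes) + 1
        else:
            index_of[name] = len(result)
            result.append([name, {code: 1}])
    return result


fileListCode = [['Seq3.xls', 'B08524_052'],
                ['Seq3.xls', 'B08524_053'],
                ['Seq3.xls', 'B08524_054'],
                ['Seq98.xls', 'B25034_001'],
                ['Seq98.xls', 'B25034_002'],
                ['Seq98.xls', 'B25034_003']]

result = split_by_file(fileListCode)
for name, codes in result:
    print(name, codes)
```

Unlike the groupby approach, this keeps the files in the order they first appear and does not require the input to be sorted.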

User

#3 · edited 2024-09-24 00:23:44

With itertools.groupby, this becomes simpler:

>>> import itertools, operator
>>> key = operator.itemgetter(0)
>>> grouped = itertools.groupby(sorted(fileListCode, key=key), key=key)
>>> [(i, {k[1]: n for n, k in enumerate(j, 1)}) for i, j in grouped]
[('Seq3.xls', {'B08524_052': 1, 'B08524_053': 2, 'B08524_054': 3}),
 ('Seq98.xls', {'B25034_001': 1, 'B25034_002': 2, 'B25034_003': 3})]

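For older versions of Python without dict comprehensions, dict() with a generator expression gives the same result:

>>> grouped = itertools.groupby(sorted(fileListCode, key=key), key=key)
>>> [(i, dict((k[1], n) for n, k in enumerate(j, 1))) for i, j in grouped]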

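But I think using a dict would be better:

>>> grouped = itertools.groupby(sorted(fileListCode, key=key), key=key)  # a groupby iterator is single-use, so rebuild it
>>> {i: {k[1]: n for n, k in enumerate(j, 1)} for i, j in grouped}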
{'Seq3.xls': {'B08524_052': 1, 'B08524_053': 2, 'B08524_054': 3},
 'Seq98.xls': {'B25034_001': 1, 'B25034_002': 2, 'B25034_003': 3}}
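
One detail worth spelling out is why the input is passed through sorted() first: itertools.groupby only merges adjacent items that share a key. The sketch below uses a small hypothetical list, interleaved, that is not part of the question, to show the difference.

```python
import itertools
import operator

key = operator.itemgetter(0)

# Hypothetical input in which the rows for one file are not contiguous
interleaved = [['Seq3.xls', 'B08524_052'],
               ['Seq98.xls', 'B25034_001'],
               ['Seq3.xls', 'B08524_053']]

# Without sorting, 'Seq3.xls' shows up as two separate groups
unsorted_keys = [name for name, _ in itertools.groupby(interleaved, key=key)]

# Sorting by the same key first makes equal keys adjacent
sorted_keys = [name for name, _ in
               itertools.groupby(sorted(interleaved, key=key), key=key)]

print(unsorted_keys)  # ['Seq3.xls', 'Seq98.xls', 'Seq3.xls']
print(sorted_keys)    # ['Seq3.xls', 'Seq98.xls']
```

Because sorted() is stable, the codes keep their original relative order within each file, so the positions assigned by enumerate are unaffected by the sort.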
