Grouping and totaling pipe-delimited values

Published 2024-06-25 07:02:56

I have a list that looks like this:

[['Biking', '10'], ['Biking|Gym', '14'], ['Biking|Gym|Hiking', '9'], ['Biking|Gym|Hiking|Running', '27']]

I want to convert it into the format ['Type', total, %], like this:

[['Biking',60,'34.7%'],['Gym',50,'28.9%'],['Hiking',36,'20.8%'],['Running',27,'15.6%']]

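I'm sure I'm doing this the hardest way possible. Can someone point me in a better direction? I have a feeling that itertools.groupby would be a good fit here, but I'm not sure how to apply it in this scenario.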

# TODO: This is totally ridiculous.
running = 0
hiking = 0
gym = 0
biking = 0
no_exercise = 0

for r in exercise_types_l:
    if 'Running' in r[0]:
        running += int(r[1])
    if 'Hiking' in r[0]:
        hiking += int(r[1])
    if 'Gym' in r[0]:
        gym += int(r[1])
    if 'Biking' in r[0]:
        biking += int(r[1])
    if 'None' in r[0]:
        no_exercise += int(r[1])

total = running + hiking + gym + biking + no_exercise

l = list()
l.append(['Running', running, '{percent:.1%}'.format(percent=running/total)])
l.append(['Hiking', hiking, '{percent:.1%}'.format(percent=hiking/total)])
l.append(['Gym', gym, '{percent:.1%}'.format(percent=gym/total)])
l.append(['Biking', biking, '{percent:.1%}'.format(percent=biking/total)])
l.append(['None', no_exercise, '{percent:.1%}'.format(percent=no_exercise/total)])

l = sorted(l, key=lambda r: r[1], reverse=True)

3 Answers

Maybe something like this (note: you could use collections.defaultdict with a default value of 0 instead of data.get):

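total = 0  # renamed from `sum` so the builtin is not shadowed
data = {}
for extype, value in exercise_types_l:
    value = int(value)  # the counts arrive as strings
    for item in extype.split('|'):
        total += value
        data[item] = data.get(item, 0) + value

l = []
for k, v in data.items():
    # float() keeps the division exact on Python 2 as well
    l.append([k, v, '{percent:.1%}'.format(percent=v / float(total))])

l = sorted(l, key=lambda r: r[1], reverse=True)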

Given an initial list such as

>>> test_list = [['Biking', '10'], ['Biking|Gym', '14'], ['Biking|Gym|Hiking', '9'], ['Biking|Gym|Hiking|Running', '27']]

you can first build a defaultdict to sum the values (this gives the second element of each entry in the final result), like so:

>>> from collections import defaultdict
>>> final_dict = defaultdict(int)
>>> for keys, values in test_list:
...     for elem in keys.split('|'):
...         final_dict[elem] += int(values)
...


>>> final_dict
defaultdict(<type 'int'>, {'Gym': 50, 'Biking': 60, 'Running': 27, 'Hiking': 36})

Then you can use a list comprehension to get the final result. This first attempt simply appends '%' to the raw fraction:

>>> final_sum = float(sum(final_dict.values()))
>>> [(elem, num, str(num/final_sum)+'%') for elem, num in final_dict.items()]
[('Gym', 50, '0.28901734104%'), ('Biking', 60, '0.346820809249%'), ('Running', 27, '0.156069364162%'), ('Hiking', 36, '0.208092485549%')]

Since you want the entries formatted and sorted, change the final step to:

>>> [(elem, num, '{:.1%}'.format(num/final_sum)) for elem, num in final_dict.items()]
[('Gym', 50, '28.9%'), ('Biking', 60, '34.7%'), ('Running', 27, '15.6%'), ('Hiking', 36, '20.8%')]
>>> from operator import itemgetter
>>> sorted([(elem, num, '{:.1%}'.format(num/final_sum)) for elem, num in final_dict.items()], key = itemgetter(1), reverse=True)
[('Biking', 60, '34.7%'), ('Gym', 50, '28.9%'), ('Hiking', 36, '20.8%'), ('Running', 27, '15.6%')]
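
The REPL steps above can be collected into a single function. The following is only a sketch, assuming Python 3 (where / is true division, so the float() call is not needed); the name summarize is a hypothetical helper and not part of the original answer. It returns lists rather than tuples to match the format requested in the question.

```python
from collections import defaultdict
from operator import itemgetter


def summarize(rows):
    """Return [type, total, percent] entries sorted by total, largest first.

    `summarize` is a hypothetical helper name used only for illustration.
    """
    totals = defaultdict(int)
    for keys, value in rows:
        # A row such as 'Biking|Gym' counts toward every type it names.
        for elem in keys.split('|'):
            totals[elem] += int(value)

    grand_total = sum(totals.values())
    if grand_total == 0:
        # Empty input or all-zero counts: no percentages can be computed.
        return []
    result = [[elem, num, '{:.1%}'.format(num / grand_total)]
              for elem, num in totals.items()]
    return sorted(result, key=itemgetter(1), reverse=True)


test_list = [['Biking', '10'], ['Biking|Gym', '14'],
             ['Biking|Gym|Hiking', '9'], ['Biking|Gym|Hiking|Running', '27']]
print(summarize(test_list))
```

With the test_list from this answer, the function should produce the same four entries as the sorted output shown above.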

You can use collections.defaultdict here. A dict is the better data structure in this case, because you can look up the values associated with any 'Type' in O(1) time.

>>> from collections import defaultdict
>>> lis = [['Biking', '10'], ['Biking|Gym', '14'], ['Biking|Gym|Hiking', '9'], ['Biking|Gym|Hiking|Running', '27']]
>>> total = 0
>>> dic = defaultdict(lambda: [0])
>>> for keys, val in lis:
...     keys = keys.split('|')
...     val = int(val)
...     total += val*len(keys)
...     for k in keys:
...         dic[k][0] += val
...
>>> for k, v in dic.items():
...     dic[k].append(format(v[0]/float(total), '.2%'))
...
>>> dic
defaultdict(<function <lambda> at 0xb60e772c>,
{'Gym': [50, '28.90%'],
 'Biking': [60, '34.68%'],
 'Running': [27, '15.61%'],
 'Hiking': [36, '20.81%']})

Accessing the values:

>>> dic['Biking']
[60, '34.68%']
>>> dic['Hiking']
[36, '20.81%']

Another option is to use a dict as the value instead of a list:

>>> dic = defaultdict(lambda: dict(val=0))
>>> total = 0
>>> for keys, val in lis:
...     keys = keys.split('|')
...     total += int(val)*len(keys)
...     for k in keys:
...         dic[k]['val'] += int(val)
...
>>> for k, v in dic.items():
...     dic[k]['percentage'] = format(v['val']/float(total), '.2%')
...
>>> dic
defaultdict(<function <lambda> at 0xb60e7b8c>, 
{'Gym': {'percentage': '28.90%', 'val': 50},
 'Biking': {'percentage': '34.68%', 'val': 60},
 'Running': {'percentage': '15.61%', 'val': 27},
 'Hiking': {'percentage': '20.81%', 'val': 36}})

Accessing the values:

#Return percentage related to 'Gym'
>>> dic['Gym']['percentage']
'28.90%'
#return the total sum of 'Biking'
>>> dic['Biking']['val']
60
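
For comparison, the itertools.groupby route raised in the question might look roughly like the sketch below, assuming Python 3. groupby only merges adjacent items that share a key, so each row first has to be expanded into (type, count) pairs and sorted by type. The variable names (rows, pairs, totals, result) are illustrative. The extra sort is the main reason a dict-based accumulation tends to be simpler for this task.

```python
from itertools import groupby
from operator import itemgetter

rows = [['Biking', '10'], ['Biking|Gym', '14'],
        ['Biking|Gym|Hiking', '9'], ['Biking|Gym|Hiking|Running', '27']]

# Expand each row into one (type, count) pair per pipe-separated name.
pairs = [(name, int(count))
         for names, count in rows
         for name in names.split('|')]

# groupby only groups consecutive items, so sort by type name first.
pairs.sort(key=itemgetter(0))
totals = {name: sum(count for _, count in group)
          for name, group in groupby(pairs, key=itemgetter(0))}

grand_total = sum(totals.values())
result = sorted(
    ([name, num, '{:.1%}'.format(num / grand_total)]
     for name, num in totals.items()),
    key=itemgetter(1), reverse=True)
print(result)
```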
