Creating a nested dictionary with running totals

Posted 2024-09-28 15:24:57


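I am trying to create a nested dictionary that holds totals computed from a list of tuples. The outer key is the product id. Each nested dictionary is keyed by month, and its value is the total number of crashes that occurred in that month.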

Example: each tuple holds (product id, date, number of crashes)

outages = [
    ('A','2018-01-01', 20),
    ('A','2018-01-01', 20),
    ('A','2018-01-01', 20),
    ('B','2018-01-15', 80),
    ('B','2018-01-19', 200),
    ('A','2018-02-08', 15),
    ('A','2018-02-09', 15),
    ('B','2018-02-15', 80),
    ('B','2018-02-15', 90),
    ('B','2018-02-20', 10),
    ('C','2018-02-25', 120),
    ('A','2018-03-01', 10),
    ('B','2018-04-01', 10),
    ('C','2018-03-01', 5)]

My expected output is:

{'A': {1: 60, 2: 30, 3: 10}, 'B': {1: 280, 2: 180, 4: 10}, 'C': {2: 120, 3: 5}}

Here is what I have so far:

from datetime import datetime

#Create a class to implement missing method for when key is not in the dictionary 
class NestedDict(dict):
    def __missing__(self, key):
        self[key] = NestedDict()
        return self[key]

#Create Instance of NestedDict
nested_dic=NestedDict()

#Loop through outages and created outer and inner key
for x in outages:
    nested_dic[x[0]][datetime.strptime(x[1], '%Y-%m-%d').month] = 0 #=>Need Help Here

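I am not sure where to go from here to get the desired output. I set the values to 0 because doing += x[2] raises an error. Ideally, I would like to loop over outages once and update the dictionary after scanning each tuple, without iterating multiple times.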


2 answers

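If you don't mind using collections.defaultdict, this can be very concise. The idea is to extract the month, convert it to an int, and accumulate the crash count in each nested dictionary: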

from collections import defaultdict

outages_by_month = defaultdict(lambda: defaultdict(int))

for prod_id, date, crashes in outages:
    outages_by_month[prod_id][int(date[5:7])] += crashes

print(outages_by_month) 
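
As a side note on why += failed in the question: a minimal sketch, assuming the error was a TypeError. With the question's NestedDict, a missing month is filled with another NestedDict, which cannot be added to an int. With defaultdict(int) as the inner factory, a missing month starts at 0 instead. The names nested, counts and error are only for illustration.

```python
from collections import defaultdict


class NestedDict(dict):
    # Same class as in the question: a missing key becomes a new NestedDict
    def __missing__(self, key):
        self[key] = NestedDict()
        return self[key]


nested = NestedDict()
error = None
try:
    # The missing month 1 is created as a NestedDict, so "+ 20" is not defined
    nested['A'][1] += 20
except TypeError as exc:
    error = exc

print(type(error).__name__)

# With int as the leaf factory, the missing month defaults to 0
counts = defaultdict(lambda: defaultdict(int))
counts['A'][1] += 20
print(counts['A'][1])
```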

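If you want plain dicts (NestedDict feels like an unnecessary abstraction), you can either build them that way from the start (setting the default keys manually) or convert afterwards with something like: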

outages_by_month = {k: dict(v) for k, v in outages_by_month.items()}

Result:

{'A': {1: 60, 2: 30, 3: 10}, 'B': {1: 280, 2: 180, 4: 10}, 'C': {2: 120, 3: 5}}
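
The other option mentioned above, building plain dicts from the start by setting the defaults manually, might look roughly like the sketch below. It uses dict.setdefault and dict.get, which is one possible approach and not necessarily what the answer's author had in mind. The names by_month and months are made up for this example.

```python
outages = [
    ('A', '2018-01-01', 20),
    ('A', '2018-01-01', 20),
    ('A', '2018-01-01', 20),
    ('B', '2018-01-15', 80),
    ('B', '2018-01-19', 200),
    ('A', '2018-02-08', 15),
    ('A', '2018-02-09', 15),
    ('B', '2018-02-15', 80),
    ('B', '2018-02-15', 90),
    ('B', '2018-02-20', 10),
    ('C', '2018-02-25', 120),
    ('A', '2018-03-01', 10),
    ('B', '2018-04-01', 10),
    ('C', '2018-03-01', 5)]

by_month = {}
for prod_id, date, crashes in outages:
    month = int(date[5:7])  # 'YYYY-MM-DD' -> month as int
    # Create the inner dict on first sight of a product id
    months = by_month.setdefault(prod_id, {})
    # Start each month at 0, then accumulate
    months[month] = months.get(month, 0) + crashes

print(by_month)
```

This makes a single pass over outages and never needs a conversion step, at the cost of two slightly more verbose lines inside the loop.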

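Use the collections module and extend MutableMapping, an abstract class that provides the interface of a Python dictionary (it lives in collections.abc; the old collections.MutableMapping alias was removed in Python 3.10). The idea is to wrap a Python dictionary and customize the __setitem__ method: it is the only method that needs a carefully thought-out implementation.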

import collections.abc


# collections.MutableMapping was removed in Python 3.10; the ABC lives in collections.abc
class RunningSumDict(collections.abc.MutableMapping):
    def __init__(self, *args, **kwargs):
        self._d = {}
        if args:
            # Called as RunningSumDict(month, crashes)
            self._d[args[0]] = args[1]
        if kwargs:
            self._d = kwargs

    def __getitem__(self, key):
        # Raise KeyError for a missing key so that "key in d" works as for a dict
        return self._d[key]

    def __setitem__(self, key, val):
        """
        key: month
        val: crashes
        Assigning to an existing month adds to its running total.
        """
        if key in self._d:
            self._d[key] += val
        else:
            self._d[key] = val

    def __delitem__(self, key):
        del self._d[key]

    def __iter__(self):
        return iter(self._d)

    def __len__(self):
        return len(self._d)

    def get(self):
        # Returns the underlying dict (note: this shadows Mapping.get(key, default))
        return self._d

    def __eq__(self, other):
        return self._d == other

    def __repr__(self):
        return repr(self._d)


class OuterDict(collections.abc.MutableMapping):
    def __init__(self, *args, **kwargs):
        self._d = {}

    def __getitem__(self, key):
        # Raise KeyError for a missing key so that "key in d" works as for a dict
        return self._d[key]

    def __setitem__(self, key, val):
        """
        key: product id
        val: tuple(month, crashes)
        """
        if key in self._d:
            month, crashes = val
            # RunningSumDict.__setitem__ adds crashes to the month's total
            self._d[key][month] = crashes
        else:
            self._d[key] = RunningSumDict(*val)

    def __delitem__(self, key):
        del self._d[key]

    def __iter__(self):
        return iter(self._d)

    def __len__(self):
        return len(self._d)

    def get(self):
        # Returns the underlying dict (note: this shadows Mapping.get(key, default))
        return self._d

    def __eq__(self, other):
        return self._d == other

    def __repr__(self):
        return repr(self._d)


def running_sum(outages):
    # Get the months only for the second entry
    outages = [(outage[0], outage[1].split('-')[1], outage[2])
               for outage in outages]

    som = OuterDict()
    for outage in outages:
        som[outage[0]] = (outage[1], outage[2])
    return som


if __name__ == "__main__":
    outages = [
        ('A', '2018-01-01', 20),
        ('A', '2018-01-01', 20),
        ('A', '2018-01-01', 20),
        ('B', '2018-01-15', 80),
        ('B', '2018-01-19', 200),
        ('A', '2018-02-08', 15),
        ('A', '2018-02-09', 15),
        ('B', '2018-02-15', 80),
        ('B', '2018-02-15', 90),
        ('B', '2018-02-20', 10),
        ('C', '2018-02-25', 120),
        ('A', '2018-03-01', 10),
        ('B', '2018-04-01', 10),
        ('C', '2018-03-01', 5)]

    print(running_sum(outages))

Result (the month keys here are zero-padded strings, not ints as in the question's expected output):

{'A': {'01': 60, '02': 30, '03': 10}, 'B': {'01': 280, '02': 180, '04': 10}, 'C': {'02': 120, '03': 5}}
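
If the integer month keys from the question's expected output are needed, one possible follow-up step is to convert the keys afterwards. The sketch below is not part of the original answer; it starts from a plain dict literal copied from the result above, and the names result and by_int_month are made up for this example.

```python
# Result copied from above, with zero-padded string months
result = {'A': {'01': 60, '02': 30, '03': 10},
          'B': {'01': 280, '02': 180, '04': 10},
          'C': {'02': 120, '03': 5}}

# Rebuild both levels as plain dicts, turning '01' into 1, '02' into 2, ...
by_int_month = {
    prod_id: {int(month): total for month, total in months.items()}
    for prod_id, months in result.items()
}

print(by_int_month)
```

Alternatively, the month could be converted with int() at the point where it is split out of the date string, so that no second pass is needed.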
