From a list of dictionaries, get the dictionaries with the highest severity value for each column key

Posted 2024-09-28 23:41:56


Given a list of dictionaries like this:

dictionaries = [
    {'column': 'NRX_TOTAL', 'severity': 1, 'threshold': 0.1},
    {'column': 'TRX_TOTAL', 'severity': 1, 'threshold': 0.1},
    {'column': 'NRX_TOTAL', 'severity': 2, 'threshold': 0.15},
    {'column': 'TRX_TOTAL', 'severity': 2, 'threshold': 0.15},
    {'column': 'NRX_TOTAL', 'severity': 3, 'threshold': 0.25},
    {'column': 'TRX_TOTAL', 'severity': 3, 'threshold': 0.25},
    {'column': 'TRX_TOTAL', 'severity': 4, 'threshold': 0.25}
]

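I want a final list of dictionaries that holds, for each 'column' key, the entry with the highest 'severity' value. For the list above, the output should be: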

output = [{'column': 'NRX_TOTAL', 'severity': 3, 'threshold': 0.25}, 
          {'column': 'TRX_TOTAL', 'severity': 4, 'threshold': 0.25}]

because the highest 'severity' for the NRX_TOTAL column is 3, while the highest 'severity' for the TRX_TOTAL column is 4.

Below is a code snippet that does the job. Any ideas on how to improve it?

l_measure_thresholds_with_highest_severity_temp = []
l_disctinct_column_value = []

for l_dict in dictionaries:
    l_temp_dict = {'column': '', 'severity': 0, 'threshold': 0}
    x = l_dict['column']
    if x not in l_disctinct_column_value:
        l_disctinct_column_value.append(x)
        l_temp_dict['column'] = x
        l_measure_thresholds_with_highest_severity_temp.append(l_temp_dict)

l_measure_thresholds_with_highest_severity = list()

for i in l_measure_thresholds_with_highest_severity_temp:
    l_temp_dict = {'column': '', 'severity': 0, 'threshold': 0}
    dict_i_col = i['column']
    dict_i_sev = i['severity']
    dict_i_threshold = i['threshold']

    for j in dictionaries:   
        dict_j_col = j['column']
        dict_j_sev = j['severity']
        dict_j_threshold = j['threshold']
        if dict_i_col == dict_j_col:
            if dict_i_sev < dict_j_sev:
                l_highest_severity = dict_j_sev
                l_highest_threshold = dict_j_threshold
            else:
                l_highest_severity = dict_i_sev
                l_highest_threshold = dict_i_threshold
            l_temp_dict['column'] = dict_i_col
            l_temp_dict['severity'] = l_highest_severity
            l_temp_dict['threshold'] = l_highest_threshold
    l_measure_thresholds_with_highest_severity.append(l_temp_dict)
    
print(l_measure_thresholds_with_highest_severity)

3 Answers

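You can build a new mapping from each 'column' to the dict with the highest severity for that column. To do this, a defaultdict is helpful for the first comparison: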

from collections import defaultdict

severities = defaultdict(lambda: {'severity': 0})

for d in dictionaries:
    column = d['column']
    if d['severity'] > severities[column]['severity']:
        severities[column] = d

print(list(severities.values()))

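The defaultdict is used for the first comparison: it creates a "dummy" dict with a severity of 0. Then, whenever a dict with the same column and a higher severity is found, it is saved in the severities dict. Finally, we print only the values, which are the original dicts, as a list.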

For your list of dictionaries, the above gives:

[{'column': 'NRX_TOTAL', 'severity': 3, 'threshold': 0.25}, 
 {'column': 'TRX_TOTAL', 'severity': 4, 'threshold': 0.25}]
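
One caveat with the severity-0 placeholder: looking up a missing key in a defaultdict stores the default, so a column whose severities are all 0 or negative would leave the placeholder in the result. A minimal sketch that avoids the placeholder with a plain dict is shown here; the function name highest_severity_per_column and the rows sample are made up for illustration and are not part of the answer above.

```python
def highest_severity_per_column(dicts):
    """Return one dict per 'column': the one with the largest 'severity'."""
    best = {}  # column name -> dict with the highest severity seen so far
    for d in dicts:
        current = best.get(d['column'])
        # Store d the first time a column appears, or when it beats the stored one.
        if current is None or d['severity'] > current['severity']:
            best[d['column']] = d
    return list(best.values())


# Hypothetical sample that includes zero and negative severities.
rows = [
    {'column': 'A', 'severity': 0, 'threshold': 0.1},
    {'column': 'B', 'severity': -1, 'threshold': 0.2},
    {'column': 'A', 'severity': 2, 'threshold': 0.3},
]

print(highest_severity_per_column(rows))
# [{'column': 'A', 'severity': 2, 'threshold': 0.3},
#  {'column': 'B', 'severity': -1, 'threshold': 0.2}]
```

On ties, the strict > comparison keeps the first dict encountered, the same as the defaultdict version.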

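This should do it for the data below, where both columns share the same highest severity (the code keeps the dicts whose severity equals the maximum over the whole list):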

a = [
    {'column': 'NRX_TOTAL', 'severity': 1, 'threshold': 0.1},
    {'column': 'TRX_TOTAL', 'severity': 1, 'threshold': 0.1},
    {'column': 'NRX_TOTAL', 'severity': 2, 'threshold': 0.15},
    {'column': 'TRX_TOTAL', 'severity': 2, 'threshold': 0.15},
    {'column': 'NRX_TOTAL', 'severity': 3, 'threshold': 0.25},
    {'column': 'TRX_TOTAL', 'severity': 3, 'threshold': 0.25}
]

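list_ = []
high_num = 0
# First pass: find the highest severity anywhere in the list.
for item in a:
    if item['severity'] > high_num:
        high_num = item['severity']
# Second pass: keep every dict whose severity equals that maximum.
for item in a:
    if item['severity'] == high_num:
        list_.append(item)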

print(list_)
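
If the maximum is needed per column rather than across the whole list, one possible variation is to sort by column and group with itertools.groupby. This is only a sketch: the function name max_per_column and the sample list are hypothetical, and it assumes every dict has both a 'column' and a 'severity' key.

```python
from itertools import groupby
from operator import itemgetter


def max_per_column(items):
    """Return the highest-severity dict for each distinct 'column'."""
    by_column = itemgetter('column')
    result = []
    # groupby only merges adjacent items, so sort by the grouping key first.
    for _, group in groupby(sorted(items, key=by_column), key=by_column):
        result.append(max(group, key=itemgetter('severity')))
    return result


# Hypothetical sample in which the columns have different maxima.
sample = [
    {'column': 'NRX_TOTAL', 'severity': 3, 'threshold': 0.25},
    {'column': 'TRX_TOTAL', 'severity': 3, 'threshold': 0.25},
    {'column': 'TRX_TOTAL', 'severity': 4, 'threshold': 0.25},
]

print(max_per_column(sample))
# [{'column': 'NRX_TOTAL', 'severity': 3, 'threshold': 0.25},
#  {'column': 'TRX_TOTAL', 'severity': 4, 'threshold': 0.25}]
```

The result comes back ordered by column name because of the sort, not in the order the columns first appear.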

Here is a generic function you can use:

a = [
    {'column': 'NRX_TOTAL', 'severity': 1, 'threshold': 0.1},
    {'column': 'TRX_TOTAL', 'severity': 1, 'threshold': 0.1},
    {'column': 'NRX_TOTAL', 'severity': 2, 'threshold': 0.15},
    {'column': 'TRX_TOTAL', 'severity': 2, 'threshold': 0.15},
    {'column': 'NRX_TOTAL', 'severity': 3, 'threshold': 0.25},
    {'column': 'TRX_TOTAL', 'severity': 3, 'threshold': 0.25}
]


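def max_group_by(lst, group, value):
    '''
    Return, for each distinct value of key `group` in the list of dicts `lst`,
    the dict with the largest value under key `value`.
    '''
    result = []
    groups = []  # group keys already handled, in first-seen order
    for d in lst:
        g = d.get(group)
        if g and g not in groups:
            groups.append(g)
            # Collect every dict in this group and keep the one with the max value.
            glist = [d2 for d2 in lst if d2.get(group) == g]
            maxval = max(glist, key=lambda x: x.get(value))
            result.append(maxval)
    return result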

print(max_group_by(a, 'column', 'severity'))
# [{'column': 'NRX_TOTAL', 'severity': 3, 'threshold': 0.25}, 
#  {'column': 'TRX_TOTAL', 'severity': 3, 'threshold': 0.25}]
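
Because max returns only the first maximal element, max_group_by keeps one dict per group even when several share the top value. If every tied dict should be kept, a variant along these lines might work; all_max_group_by and the data sample are hypothetical names, and the sketch assumes each dict contains both keys.

```python
def all_max_group_by(lst, group, value):
    """Return every dict whose `value` equals the maximum within its `group`."""
    best = {}  # group key -> highest value seen for that group
    for d in lst:
        g = d[group]
        if g not in best or d[value] > best[g]:
            best[g] = d[value]
    # Second pass: keep all dicts that reach their group's maximum, in input order.
    return [d for d in lst if d[value] == best[d[group]]]


# Hypothetical sample with a tie inside the NRX_TOTAL group.
data = [
    {'column': 'NRX_TOTAL', 'severity': 3, 'threshold': 0.2},
    {'column': 'NRX_TOTAL', 'severity': 3, 'threshold': 0.25},
    {'column': 'TRX_TOTAL', 'severity': 1, 'threshold': 0.1},
    {'column': 'TRX_TOTAL', 'severity': 4, 'threshold': 0.25},
]

print(all_max_group_by(data, 'column', 'severity'))
# [{'column': 'NRX_TOTAL', 'severity': 3, 'threshold': 0.2},
#  {'column': 'NRX_TOTAL', 'severity': 3, 'threshold': 0.25},
#  {'column': 'TRX_TOTAL', 'severity': 4, 'threshold': 0.25}]
```

This version makes two passes over the list instead of rescanning it once per group.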
