Detecting duplicates in a JSON list and removing them

{ "alerts": [ { "description": "Es tritt leichter Frost auf.", "end": 1613379600, "event": "FROST", "lang": "de", "sender_name": "DWD / Nationales Warnzentrum Offenbach", "start": 1613322000 }, { "description": "There is a risk of frost", "end": 1613379600, "event": "frost", "lang": "en", "sender_name": "DWD / Nationales Warnzentrum Offenbach", "start": 1613322000 }, { "description": "There is a risk of wind gusts", "end": 1613408400, "event": "wind gusts", "lang": "en", "sender_name": "DWD / Nationales Warnzentrum Offenbach", "start": 1613336400 } ] }

{ "alerts": [ { "description": "Es tritt leichter Frost auf.", "end": 1613379600, "event": "FROST", "lang": "de", "sender_name": "DWD / Nationales Warnzentrum Offenbach", "start": 1613322000 }, { "description": "There is a risk of wind gusts", "end": 1613408400, "event": "wind gusts", "lang": "en", "sender_name": "DWD / Nationales Warnzentrum Offenbach", "start": 1613336400 } ] }

3 answers

User

#1 · edited 2024-09-30 06:34:04

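Sort the input list by lang in reverse order, so that en comes before de, then build a dict whose keys are the tuple (start, end) and take dict.values(). Because de comes after en, whenever two alerts share the same (start, end) key, the de alert overwrites the en value stored under that key.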

data = {
"alerts": [
    {
        "description": "Es tritt leichter Frost auf.",
        "end": 1613379600,
        "event": "FROST",
        "lang": "de",
        "sender_name": "DWD / Nationales Warnzentrum Offenbach",
        "start": 1613322000
    },
    {
        "description": "There is a risk of wind gusts",
        "end": 1613408400,
        "event": "wind gusts",
        "lang": "en",
        "sender_name": "DWD / Nationales Warnzentrum Offenbach",
        "start": 1613336400
    }]}

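# Reverse sort by lang puts "en" before "de"; on a (start, end) clash the later "de" entry wins.
unique = {(item['start'], item['end']): item for item in
          sorted(data['alerts'], key=lambda x: x['lang'], reverse=True)}
# Optional: put the surviving alerts back in chronological order.
data['alerts'] = sorted(unique.values(), key=lambda x: (x['start'], x['end']))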

Output

{
    "alerts": [
        {
            "description": "Es tritt leichter Frost auf.",
            "end": 1613379600,
            "event": "FROST",
            "lang": "de",
            "sender_name": "DWD / Nationales Warnzentrum Offenbach",
            "start": 1613322000
        },
        {
            "description": "There is a risk of wind gusts",
            "end": 1613408400,
            "event": "wind gusts",
            "lang": "en",
            "sender_name": "DWD / Nationales Warnzentrum Offenbach",
            "start": 1613336400
        }
    ]
}

I am not sure whether you need the result sorted by time; if you do not, you can drop that last step.
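The sample data in this answer contains no duplicate, so here is a minimal sketch of the same idea run against the three alerts from the question. The helper name `dedupe_alerts` and its parameters `preferred_lang` and `sort_by_time` are made up for illustration, and the alerts are trimmed to the fields that matter. Instead of a reverse alphabetical sort, it sorts on whether the language matches the preferred one, which should generalize beyond the en/de pair.

```python
def dedupe_alerts(alerts, preferred_lang="de", sort_by_time=True):
    """Keep one alert per (start, end) window, preferring `preferred_lang`."""
    # False sorts before True, so preferred-language alerts end up last
    # and overwrite earlier entries that share the same dict key.
    ordered = sorted(alerts, key=lambda a: a["lang"] == preferred_lang)
    unique = {(a["start"], a["end"]): a for a in ordered}
    result = list(unique.values())
    if sort_by_time:
        # Optional step: restore chronological order.
        result.sort(key=lambda a: (a["start"], a["end"]))
    return result


# Trimmed copies of the question's alerts (description and sender_name omitted).
alerts = [
    {"event": "FROST", "lang": "de", "start": 1613322000, "end": 1613379600},
    {"event": "frost", "lang": "en", "start": 1613322000, "end": 1613379600},
    {"event": "wind gusts", "lang": "en", "start": 1613336400, "end": 1613408400},
]

deduped = dedupe_alerts(alerts)
print([(a["event"], a["lang"]) for a in deduped])
# [('FROST', 'de'), ('wind gusts', 'en')]
```

Passing `sort_by_time=False` skips the final sort, which is the part the answer says can be dropped.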

User

#2 · edited 2024-09-30 06:34:04

You can filter with a dict comprehension:

data = {
"alerts": [
    {
        "description": "Es tritt leichter Frost auf.",
        "end": 1613379600,
        "event": "FROST",
        "lang": "de",
        "sender_name": "DWD / Nationales Warnzentrum Offenbach",
        "start": 1613322000
    },
    {
        "description": "There is a risk of frost",
        "end": 1613379600,
        "event": "frost",
        "lang": "en",
        "sender_name": "DWD / Nationales Warnzentrum Offenbach",
        "start": 1613322000
    },
    {
        "description": "There is a risk of wind gusts",
        "end": 1613408400,
        "event": "wind gusts",
        "lang": "en",
        "sender_name": "DWD / Nationales Warnzentrum Offenbach",
        "start": 1613336400
    }]}

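# Iterate back to front so that, for each (start, end) key, the first entry in the list is written last and wins.
filtered = {(entry["start"], entry["end"]): entry for entry in reversed(data["alerts"])}

data["alerts"] = list(filtered.values())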

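This approach relies on the fact that a duplicate dictionary key is overwritten by the last entry written to it. If you want to keep the last duplicate entry instead of the first, remove reversed().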
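A small sketch of the difference, using trimmed copies of the question's alerts. One side effect worth knowing: with reversed(), the surviving alerts come out in reverse order as well. The setdefault loop at the end is an alternative that is not part of this answer; it keeps the first entry per key while preserving the original order. The helper name `window` is made up for illustration.

```python
# Trimmed copies of the question's alerts (description and sender_name omitted).
alerts = [
    {"event": "FROST", "lang": "de", "start": 1613322000, "end": 1613379600},
    {"event": "frost", "lang": "en", "start": 1613322000, "end": 1613379600},
    {"event": "wind gusts", "lang": "en", "start": 1613336400, "end": 1613408400},
]


def window(alert):
    """Dict key identifying alerts that cover the same time span."""
    return (alert["start"], alert["end"])


# With reversed(): the first duplicate in the list is written last, so it wins.
keep_first = list({window(a): a for a in reversed(alerts)}.values())

# Without reversed(): the last duplicate in the list wins.
keep_last = list({window(a): a for a in alerts}.values())

# Alternative: setdefault only stores a key the first time it is seen,
# so the first duplicate wins and the original order is preserved.
first_seen = {}
for a in alerts:
    first_seen.setdefault(window(a), a)
keep_first_in_order = list(first_seen.values())

print([a["event"] for a in keep_first])           # ['wind gusts', 'FROST']
print([a["event"] for a in keep_last])            # ['frost', 'wind gusts']
print([a["event"] for a in keep_first_in_order])  # ['FROST', 'wind gusts']
```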

User

#3 · edited 2024-09-30 06:34:04

You can use itertools.groupby [Python-docs] to group all alerts that share the same timestamps, then pick the English entry

from itertools import groupby

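# groupby only merges adjacent items, so sort by the same key first
data["alerts"] = sorted(data["alerts"], key=lambda x: (x["end"], x["start"]))
data["alerts"] = [
    g
    for key, group in groupby(data["alerts"], key=lambda x: (x["end"], x["start"]))
    for g in group
    # note: a group that has no "en" entry is dropped entirely
    if g["lang"] == "en"  # change accordingly
]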
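As written, the filter above discards every alert that is not in the chosen language, including alerts that exist in only one language. A possible variant, sketched here under the assumption that each time window should always yield exactly one alert, picks the preferred language when the group has it and otherwise falls back to the first alert in the group. The names `window`, `pick` and `dedupe_with_groupby` are made up for illustration, and the alerts are trimmed copies of the question's data.

```python
from itertools import groupby


def window(alert):
    """Grouping key: alerts with the same start and end are duplicates."""
    return (alert["start"], alert["end"])


def pick(group, preferred_lang):
    """Return the preferred-language alert, or the first one as a fallback."""
    candidates = list(group)
    for alert in candidates:
        if alert["lang"] == preferred_lang:
            return alert
    return candidates[0]


def dedupe_with_groupby(alerts, preferred_lang="en"):
    # groupby only merges adjacent items, so sort by the grouping key first.
    ordered = sorted(alerts, key=window)
    return [pick(group, preferred_lang) for _, group in groupby(ordered, key=window)]


# Trimmed copies of the question's alerts (description and sender_name omitted).
alerts = [
    {"event": "FROST", "lang": "de", "start": 1613322000, "end": 1613379600},
    {"event": "frost", "lang": "en", "start": 1613322000, "end": 1613379600},
    {"event": "wind gusts", "lang": "en", "start": 1613336400, "end": 1613408400},
]

print([a["event"] for a in dedupe_with_groupby(alerts, "en")])  # ['frost', 'wind gusts']
print([a["event"] for a in dedupe_with_groupby(alerts, "de")])  # ['FROST', 'wind gusts']
```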
