Remove dicts from a list if they have the same keys/values in Python

Posted 2024-10-17 08:25:32


How can I remove dicts from the results list when their model, url, and price_int are the same (duplicates)? JSON example:

[{
    "id": 1,
    "results": [
        {
            "model": "Audi Audi TT Roadster",
            "price_int": 2200,
            "rzc_result_url": "https://url1.jpg"
        },
        {
            "model": "Audi TT Roadster 1.8 T",
            "price_int": 2999,
            "rzc_result_url": "https://url1.jpg"
        },
        {
            "model": "Audi TT Roadster 1.8 T",
            "price_int": 2999,
            "rzc_result_url": "https://url1.jpg"
        }]
},
...

]

Expected output:

[{
    "id": 1,
    "results": [
        {
            "model": "Audi Audi TT Roadster",
            "price_int": 2200,
            "rzc_result_url": "https://url1.jpg"
        },
        {
            "model": "Audi TT Roadster 1.8 T",
            "price_int": 2999,
            "rzc_result_url": "https://url1.jpg"
        }]
},
...

]

Code:

def removeDoubles():
    results = item["results"]
    if not results == []:
        for result in results:
            urlList = result["url"]
            modelList = result["model"]
            priceIntList = result["price_int"]
            ... What to do ?
removeDoubles()

I know I'm still far from a solution, but how can I remove duplicates based on these three keys/values?


3 Answers

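You need to iterate over the value of each 'results' key and do a membership check before appending to another list. That way, a dict that has already been added is not added to the new list again:

for x in lst:
    # Start a fresh list for each item so items do not share one results list.
    s = []
    for r in x['results']:
        if r not in s:
            s.append(r)
    x['results'] = s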
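
If only certain fields should decide whether two entries are duplicates, which is what the question asks for, a key-based check is one option. The following is a minimal sketch rather than the answerer's code: the function name `dedupe_by_keys` and its `keys` parameter are hypothetical, and it assumes the values stored under those keys are hashable (strings and numbers here).

```python
def dedupe_by_keys(results, keys=("model", "price_int", "rzc_result_url")):
    """Return results without entries whose values under `keys` repeat.

    Keeps the first occurrence of each combination and preserves order.
    """
    seen = set()
    unique = []
    for result in results:
        # The tuple of selected values identifies a duplicate entry.
        signature = tuple(result.get(key) for key in keys)
        if signature not in seen:
            seen.add(signature)
            unique.append(result)
    return unique


# Sample data taken from the question.
lst = [{
    "id": 1,
    "results": [
        {"model": "Audi Audi TT Roadster", "price_int": 2200,
         "rzc_result_url": "https://url1.jpg"},
        {"model": "Audi TT Roadster 1.8 T", "price_int": 2999,
         "rzc_result_url": "https://url1.jpg"},
        {"model": "Audi TT Roadster 1.8 T", "price_int": 2999,
         "rzc_result_url": "https://url1.jpg"},
    ],
}]

for x in lst:
    x["results"] = dedupe_by_keys(x["results"])
```

Because membership is tested against a set, each lookup is constant time on average, whereas `r not in s` on a list scans the whole list.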

You can compare dicts directly to check whether they have the same keys/values:

from pprint import pprint
data = [
    {
        "id": 1,
        "results": [
            {
                "model": "Audi Audi TT Roadster",
                "price_int": 2200,
                "rzc_result_url": "https://url1.jpg",
            },
            {
                "model": "Audi TT Roadster 1.8 T",
                "price_int": 2999,
                "rzc_result_url": "https://url1.jpg",
            },
            {
                "rzc_result_url": "https://url1.jpg",
                "model": "Audi TT Roadster 1.8 T",
                "price_int": 2999,
            },
        ],
    },
]

for item in data:
    item['results'] = [result for i, result in enumerate(item['results']) if result not in item['results'][i + 1:]]

pprint(data)

Prints:

[{'id': 1,
  'results': [{'model': 'Audi Audi TT Roadster',
               'price_int': 2200,
               'rzc_result_url': 'https://url1.jpg'},
              {'model': 'Audi TT Roadster 1.8 T',
               'price_int': 2999,
               'rzc_result_url': 'https://url1.jpg'}]}]
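
The third entry in the sample data lists its keys in a different order and is still treated as a duplicate. A small sketch of why, using made-up variable names: dict equality compares keys and values only, and the `in` operator on a list applies `==` to each element.

```python
a = {"model": "Audi TT Roadster 1.8 T", "price_int": 2999}
b = {"price_int": 2999, "model": "Audi TT Roadster 1.8 T"}

# Key order does not affect equality.
same = a == b

# `in` on a list compares with ==, so b is found even though
# the list holds a different dict object.
found = b in [a]

# Changing any single value makes the dicts unequal.
different = a == {**a, "price_int": 3000}
```

Here `same` and `found` are `True`, and `different` is `False`.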

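When we don't care about order, the usual way to remove duplicates is to put them in a set.

However, we can't put dicts into a set as they are, because they aren't hashable. We can use a trick: store each dict's data in a hashable form (which lets the set remove duplicates naturally), then convert back to the original data: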

def remove_duplicates(dicts):
    # A set made from the hashable equivalent of each dict.
    unique = {frozenset(d.items()) for d in dicts}
    # Now we go backwards, building a list from the dict equivalents.
    return [dict(hashable) for hashable in unique]
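
If order does matter, the same hashable-form idea can be combined with `dict.fromkeys`, which keeps keys in insertion order. This is a sketch under assumptions, not part of the answer above: the name `remove_duplicates_ordered` is hypothetical, and every value in each dict must be hashable (a nested list or dict would raise `TypeError`).

```python
def remove_duplicates_ordered(dicts):
    # dict.fromkeys keeps only the first occurrence of each key
    # and remembers the order in which keys were first seen.
    unique = dict.fromkeys(frozenset(d.items()) for d in dicts)
    # Rebuild a dict from each frozenset of (key, value) pairs.
    return [dict(hashable) for hashable in unique]


results = [
    {"model": "Audi Audi TT Roadster", "price_int": 2200,
     "rzc_result_url": "https://url1.jpg"},
    {"model": "Audi TT Roadster 1.8 T", "price_int": 2999,
     "rzc_result_url": "https://url1.jpg"},
    {"rzc_result_url": "https://url1.jpg",
     "model": "Audi TT Roadster 1.8 T", "price_int": 2999},
]

deduped = remove_duplicates_ordered(results)
```

The entries come back in their original relative order, although the key order inside each rebuilt dict is not guaranteed to match the input.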
