How can I compare multiple dicts in Python and remove the duplicate values?

Posted 2024-09-30 07:23:21


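As you can see below, I have a "main" dictionary in which each value itself holds dicts (a list of dicts in the data shown). I want to compare the "name" values across the entries of the main dict (there can be more than two), for example "DE, Stuttgart" with "DE, Dresden" and X, and keep only the entries whose "name" value is unique.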

I know the x for x in y if x['key'] != None construct, for example, but as far as I can tell I can only use it to filter a single dictionary.

Input:

"DE, Stuttgart": [
    {
        "url": "http://twitter.com/search?q=%23ISIS", 
        "query": "%23ISIS", 
        "tweet_volume": 21646, 
        "name": "#ISIS", 
        "promoted_content": null
    }, 
    {
        "url": "http://twitter.com/search?q=%22Hans+Rosling%22", 
        "query": "%22Hans+Rosling%22", 
        "tweet_volume": 44855, 
        "name": "Hans Rosling", 
        "promoted_content": null
    }, 
    {
        "url": "http://twitter.com/search?q=%22Betsy+DeVos%22", 
        "query": "%22Betsy+DeVos%22", 
        "tweet_volume": 664741, 
        "name": "Betsy DeVos", 
        "promoted_content": null
    }, 
    {
        "url": "http://twitter.com/search?q=Nioh", 
        "query": "Nioh", 
        "tweet_volume": 24160, 
        "name": "Nioh", 
        "promoted_content": null
    }, 
    {
        "url": "http://twitter.com/search?q=%23FCBWOB", 
        "query": "%23FCBWOB", 
        "tweet_volume": 14216, 
        "name": "#FCBWOB", 
        "promoted_content": null
    }, 
    {
        "url": "http://twitter.com/search?q=%23sid2017", 
        "query": "%23sid2017", 
        "tweet_volume": 28277, 
        "name": "#sid2017", 
        "promoted_content": null
    }
], 
"DE, Dresden": [
    {
        "url": "http://twitter.com/search?q=%22Hans+Rosling%22", 
        "query": "%22Hans+Rosling%22", 
        "tweet_volume": 44855, 
        "name": "Hans Rosling", 
        "promoted_content": null
    }, 
    {
        "url": "http://twitter.com/search?q=%22Betsy+DeVos%22", 
        "query": "%22Betsy+DeVos%22", 
        "tweet_volume": 664741, 
        "name": "Betsy DeVos", 
        "promoted_content": null
    }, 
    {
        "url": "http://twitter.com/search?q=Nioh", 
        "query": "Nioh", 
        "tweet_volume": 24160, 
        "name": "Nioh", 
        "promoted_content": null
    }, 
    {
        "url": "http://twitter.com/search?q=%23FCBWOB", 
        "query": "%23FCBWOB", 
        "tweet_volume": 14216, 
        "name": "#FCBWOB", 
        "promoted_content": null
    }, 
    {
        "url": "http://twitter.com/search?q=%23sid2017", 
        "query": "%23sid2017", 
        "tweet_volume": 28277, 
        "name": "#sid2017", 
        "promoted_content": null
    }
], 

Output:

"DE, Stuttgart": [
    {
        "url": "http://twitter.com/search?q=%23ISIS", 
        "query": "%23ISIS", 
        "tweet_volume": 21646, 
        "name": "#ISIS", 
        "promoted_content": null
    }
], 
"DE, Dresden": [
], 

3 Answers

Suppose d1 and d2 are your two dictionaries. You can get the list of keys that are in d1 but not in d2 like this:

[k for k in d1 if k not in d2]
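
As a minimal sketch of that comparison, here it is applied to two small dictionaries whose contents are made up for illustration (they are not taken from the question). Note that it compares top-level keys only, not the nested "name" values the question asks about.

```python
# Hypothetical example data, not from the question.
d1 = {"a": 1, "b": 2, "c": 3}
d2 = {"b": 20, "c": 30, "d": 40}

# Keys that appear in d1 but not in d2 (order follows d1).
only_in_d1 = [k for k in d1 if k not in d2]

# dict key views support set operations, so this gives the same keys as a set.
only_in_d1_set = d1.keys() - d2.keys()

print(only_in_d1)      # ['a']
print(only_in_d1_set)  # {'a'}
```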

You can collect the names into a collections.Counter, then rebuild the original dict while keeping only the sub-dicts whose name is unique:

main = {
    "DE, Stuttgart": [
        {
            "url": "http://twitter.com/search?q=%23ISIS",
            "query": "%23ISIS",
            "tweet_volume": 21646,
            "name": "#ISIS",
            "promoted_content": None
        },
        {
            "url": "http://twitter.com/search?q=%22Hans+Rosling%22",
            "query": "%22Hans+Rosling%22",
            "tweet_volume": 44855,
            "name": "Hans Rosling",
            "promoted_content": None
        },
        {
            "url": "http://twitter.com/search?q=%22Betsy+DeVos%22",
            "query": "%22Betsy+DeVos%22",
            "tweet_volume": 664741,
            "name": "Betsy DeVos",
            "promoted_content": None
        },
        {
            "url": "http://twitter.com/search?q=Nioh",
            "query": "Nioh",
            "tweet_volume": 24160,
            "name": "Nioh",
            "promoted_content": None
        },
        {
            "url": "http://twitter.com/search?q=%23FCBWOB",
            "query": "%23FCBWOB",
            "tweet_volume": 14216,
            "name": "#FCBWOB",
            "promoted_content": None
        },
        {
            "url": "http://twitter.com/search?q=%23sid2017",
            "query": "%23sid2017",
            "tweet_volume": 28277,
            "name": "#sid2017",
            "promoted_content": None
        }
    ],
    "DE, Dresden": [
        {
            "url": "http://twitter.com/search?q=%22Hans+Rosling%22",
            "query": "%22Hans+Rosling%22",
            "tweet_volume": 44855,
            "name": "Hans Rosling",
            "promoted_content": None
        },
        {
            "url": "http://twitter.com/search?q=%22Betsy+DeVos%22",
            "query": "%22Betsy+DeVos%22",
            "tweet_volume": 664741,
            "name": "Betsy DeVos",
            "promoted_content": None
        },
        {
            "url": "http://twitter.com/search?q=Nioh",
            "query": "Nioh",
            "tweet_volume": 24160,
            "name": "Nioh",
            "promoted_content": None
        },
        {
            "url": "http://twitter.com/search?q=%23FCBWOB",
            "query": "%23FCBWOB",
            "tweet_volume": 14216,
            "name": "#FCBWOB",
            "promoted_content": None
        },
        {
            "url": "http://twitter.com/search?q=%23sid2017",
            "query": "%23sid2017",
            "tweet_volume": 28277,
            "name": "#sid2017",
            "promoted_content": None
        }
    ]
}
from collections import Counter
import pprint

names = Counter(d['name'] for l in main.values() for d in l)
result = {k: [d for d in v if names[d['name']] == 1] for k, v in main.items()}

pprint.pprint(result)

Output:

{'DE, Dresden': [],
 'DE, Stuttgart': [{'name': '#ISIS',
                    'promoted_content': None,
                    'query': '%23ISIS',
                    'tweet_volume': 21646,
                    'url': 'http://twitter.com/search?q=%23ISIS'}]}
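
The same counting idea can be wrapped in a function and tried on more than two locations. The sketch below assumes a hypothetical helper name, keep_unique_names, and shortened made-up data that carries only the "name" field; "DE, Berlin" is an invented third location.

```python
from collections import Counter


def keep_unique_names(main):
    """Return a new dict keeping only entries whose "name" occurs once overall."""
    # Count every name across all locations in a single pass.
    counts = Counter(d["name"] for entries in main.values() for d in entries)
    # Rebuild each location's list, dropping names seen more than once.
    return {
        location: [d for d in entries if counts[d["name"]] == 1]
        for location, entries in main.items()
    }


# Hypothetical, shortened data for illustration.
sample = {
    "DE, Stuttgart": [{"name": "#ISIS"}, {"name": "Nioh"}],
    "DE, Dresden": [{"name": "Nioh"}, {"name": "#sid2017"}],
    "DE, Berlin": [{"name": "#sid2017"}, {"name": "Hans Rosling"}],
}

result = keep_unique_names(sample)
print(result)
```

One consequence of counting globally: a name that appears twice inside the same location is also treated as a duplicate and dropped.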

The following produces the desired dict for any number of locations, although @niemmi's solution is more efficient:

main_dict = {"DE, Stuttgart": [
    {
        "url": "http://twitter.com/search?q=%23ISIS", 
        "query": "%23ISIS", 
        "tweet_volume": 21646, 
        "name": "#ISIS", 
        "promoted_content": None
    }, 
    {
        "url": "http://twitter.com/search?q=%22Hans+Rosling%22", 
        "query": "%22Hans+Rosling%22", 
        "tweet_volume": 44855, 
        "name": "Hans Rosling", 
        "promoted_content": None
    }, 
    {
        "url": "http://twitter.com/search?q=%22Betsy+DeVos%22", 
        "query": "%22Betsy+DeVos%22", 
        "tweet_volume": 664741, 
        "name": "Betsy DeVos", 
        "promoted_content": None
    }, 
    {
        "url": "http://twitter.com/search?q=Nioh", 
        "query": "Nioh", 
        "tweet_volume": 24160, 
        "name": "Nioh", 
        "promoted_content": None
    }, 
    {
        "url": "http://twitter.com/search?q=%23FCBWOB", 
        "query": "%23FCBWOB", 
        "tweet_volume": 14216, 
        "name": "#FCBWOB", 
        "promoted_content": None
    }, 
    {
        "url": "http://twitter.com/search?q=%23sid2017", 
        "query": "%23sid2017", 
        "tweet_volume": 28277, 
        "name": "#sid2017", 
        "promoted_content": None
    }
], 
"DE, Dresden": [
    {
        "url": "http://twitter.com/search?q=%22Hans+Rosling%22", 
        "query": "%22Hans+Rosling%22", 
        "tweet_volume": 44855, 
        "name": "Hans Rosling", 
        "promoted_content": None
    }, 
    {
        "url": "http://twitter.com/search?q=%22Betsy+DeVos%22", 
        "query": "%22Betsy+DeVos%22", 
        "tweet_volume": 664741, 
        "name": "Betsy DeVos", 
        "promoted_content": None
    }, 
    {
        "url": "http://twitter.com/search?q=Nioh", 
        "query": "Nioh", 
        "tweet_volume": 24160, 
        "name": "Nioh", 
        "promoted_content": None
    }, 
    {
        "url": "http://twitter.com/search?q=%23FCBWOB", 
        "query": "%23FCBWOB", 
        "tweet_volume": 14216, 
        "name": "#FCBWOB", 
        "promoted_content": None
    }, 
    {
        "url": "http://twitter.com/search?q=%23sid2017", 
        "query": "%23sid2017", 
        "tweet_volume": 28277, 
        "name": "#sid2017", 
        "promoted_content": None
    }
]
}

def get_names(main_dict, location):
    return {small_dict["name"] for small_dict in main_dict[location]}

def get_names_from_other_locations(main_dict, location):
    other_locations = [other_loc for other_loc in main_dict if other_loc != location]
    return {small_dict["name"] for other_location in other_locations for small_dict in main_dict[other_location]}

def get_uniq_names(main_dict, location):
    return get_names(main_dict, location) - get_names_from_other_locations(main_dict, location)

def get_dict(main_dict, location, name):
    for small_dict in main_dict[location]:
        if small_dict["name"] == name:
            return small_dict
    return None

# print is a function in Python 3, so the argument must be wrapped in parentheses.
print({location: [get_dict(main_dict, location, uniq_name) for uniq_name in get_uniq_names(main_dict, location)] for location in main_dict})
# {'DE, Stuttgart': [{'url': 'http://twitter.com/search?q=%23ISIS', 'query': '%23ISIS', 'tweet_volume': 21646, 'name': '#ISIS', 'promoted_content': None}], 'DE, Dresden': []}
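
Because the approach above goes through sets, the order of entries within a location is not guaranteed when several names are unique. A possible variant, sketched here under the assumption that the original list order should be kept, filters each list directly against the names seen in the other locations. The function name unique_per_location and the shortened sample data are hypothetical.

```python
def unique_per_location(main_dict):
    """Keep, per location, the entries whose "name" appears in no other location."""
    result = {}
    for location, entries in main_dict.items():
        # Names used by every location except the current one.
        other_names = {
            d["name"]
            for other, other_entries in main_dict.items()
            if other != location
            for d in other_entries
        }
        # Filtering the list keeps the entries in their original order.
        result[location] = [d for d in entries if d["name"] not in other_names]
    return result


# Hypothetical, shortened data for illustration.
sample = {
    "DE, Stuttgart": [{"name": "#ISIS"}, {"name": "Nioh"}],
    "DE, Dresden": [{"name": "Nioh"}],
}

filtered = unique_per_location(sample)
print(filtered)
# {'DE, Stuttgart': [{'name': '#ISIS'}], 'DE, Dresden': []}
```

Unlike the Counter-based answer, this variant keeps a name that is repeated only within a single location.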
