How do I remove duplicate dictionaries from a list?

Posted 2024-06-26 17:48:59


With dynamic values, the same entry sometimes shows up more than once. For example, given the variable

table = [
    {'man':'tim','age':'2','h':'5','w':'40'},
    {'man':'jim','age':'4','h':'3','w':'20'},
    {'man':'jon','age':'24','h':'5','w':'80'}, 
    {'man':'tim','age':'2','h':'5','w':'40'},
    {'man':'tto','age':'7','h':'4','w':'49'}    
]

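Here the dictionary {'man':'tim','age':'2','h':'5','w':'40'} appears twice. All of these values are dynamic.

How can I prevent this, so that the list contains no duplicate dictionaries before it is rendered to the template?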

Edit: actual data

[{'scorecardid': 1, 'progress2': 'preview', 'series2': 'Afghanistan v Zimbabwe in UAE, 2018', 'Commentary1': '/Commentary1', 'commentaryid': 1, 'matchid2': '10', 'matchno2': '5th ODI', 'teams2': 'AFG vs ZIM', 'matchtype2': 'ODI', 'Scorecard1': '/Scorecard1', 'status2': 'Starts on Feb 19 at 10:30 GMT'}, {'six2': '0', 'scorecardid': 2, 'overs5': '4', 'fours1': '0', 'overs10': '20', 'Batting_team_img': 'images/RSA.png', 'wickets20': '5', 'wickets6': '1', 'Bowling_team_img': 'images/IND.png', 'maidens6': '0', 'Batting team': 'RSA', 'matchid2': '9', 'name6': 'Unadkat', 'teams2': 'RSA vs IND', 'wickets10': '9', 'desc10': 'Inns', 'runs5': '32', 'matchtype2': 'T20', 'Scorecard1': '/Scorecard2', 'runs1': '2', 'wickets5': '0', 'runs6': '33', 'runs2': '0', 'maidens5': '0', 'runs20': '203', 'name5': 'Bumrah*', 'progress2': 'complete', 'Commentary1': '/Commentary2', 'fours2': '0', 'series2': 'India tour of South Africa, 2017-18', 'name1': 'Junior Dala*', 'commentaryid': 2, 'matchno2': '1st T20I', 'six1': '0', 'overs6': '4', 'Bowling team': 'IND', 'balls2': '2', 'balls1': '3', 'name2': 'Shamsi', 'overs20': '20', 'runs10': '175', 'desc20': 'Inns', 'status2': 'Ind won by 28 runs'}, {'scorecardid': 3, 'overs5': '0.4', 'fours1': '0', 'overs10': '18.4', 'Batting_team_img': 'images/BAN.png', 'wickets20': '4', 'wickets6': '1', 'Bowling_team_img': 'images/SL.png', 'Batting team': 'BAN', 'matchid2': '6', 'name6': 'Shanaka', 'teams2': 'BAN vs SL', 'wickets10': '10', 'desc10': 'Inns', 'runs5': '3', 'matchtype2': 'T20', 'Scorecard1': '/Scorecard3', 'runs1': '1', 'wickets5': '2', 'runs6': '5', 'maidens5': '0', 'runs20': '210', 'progress2': 'complete', 'Commentary1': '/Commentary3', 'name5': 'Gunathilaka*', 'series2': 'Sri Lanka tour of Bangladesh, 2018', 'name1': 'Nazmul Islam', 'commentaryid': 3, 'matchno2': '2nd T20I', 'six1': '0', 'overs6': '1.5', 'Bowling team': 'SL', 'maidens6': '0', 'balls1': '1', 'overs20': '20', 'runs10': '135', 'desc20': 'Inns', 'status2': 'SL won by 75 runs'}, 
{'six2': '2', 'scorecardid': 4, 'overs5': '4', 'fours1': '1', 'overs10': '20', 'Batting_team_img': 'images/NZ.png', 'wickets20': '7', 'wickets6': '1', 'Bowling_team_img': 'images/ENG.png', 'maidens6': '0', 'Batting team': 'NZ', 'matchid2': '4', 'name6': 'Tom Curran', 'teams2': 'NZ vs ENG', 'wickets10': '4', 'desc10': 'Inns', 'runs5': '41', 'matchtype2': 'T20', 'Scorecard1': '/Scorecard4', 'runs1': '7', 'wickets5': '0', 'runs6': '32', 'runs2': '37', 'maidens5': '0', 'runs20': '194', 'name5': 'Chris Jordan*', 'progress2': 'complete', 'Commentary1': '/Commentary4', 'fours2': '2', 'series2': 'England, Australia, New Zealand T20I Tri-Series, 2018', 'name1': 'de Grandhomme*', 'commentaryid': 4, 'matchno2': '6th Match', 'six1': '0', 'overs6': '3', 'Bowling team': 'ENG', 'balls2': '30', 'balls1': '5', 'name2': 'Chapman', 'overs20': '20', 'runs10': '192', 'desc20': 'Inns', 'status2': 'Eng won by 2 runs'}, {'scorecardid': 5, 'overs5': '7.4', 'fours1': '3', 'runs20': '213', 'six2': '0', 'commentaryid': 5, 'Batting team': 'SAUS', 'matchid2': '18770', 'matchno2': '21st Match', 'wickets10': '3', 'overs10': '49.4', 'matchtype2': 'TEST', 'runs1': '26', 'overs6': '8', 'runs6': '39', 'runs2': '49', 'name1': 'Mennie*', 'name5': 'Daniel Fallins*', 'series2': 'Sheffield Shield, 2017-18', 'Commentary1': '/Commentary5', 'wickets6': '1', 'runs11': '281', 'six1': '0', 'runs10': '192', 'balls1': '58', 'overs11': '74.1', 'maidens5': '1', 'desc21': '1st Inns', 'status2': 'South Aus won by 7 wkts', 'runs5': '51', 'wickets11': '10', 'desc11': '1st Inns', 'desc20': '2nd Inns', 'wickets20': '10', 'wickets21': '10', 'teams2': 'NSW vs SAUS', 'balls2': '85', 'Scorecard1': '/Scorecard5', 'wickets5': '1', 'progress2': 'Result', 'runs21': '256', 'fours2': '6', 'desc10': '2nd Inns', 'name6': 'Stobo', 'maidens6': '1', 'Bowling team': 'NSW', 'name2': 'Ferguson', 'overs20': '68.4', 'overs21': '90.4'}, {'six2': '0', 'scorecardid': 6, 'overs5': '4', 'fours1': '0', 'overs10': '20', 'Batting_team_img': 
'images/RSA.png', 'wickets20': '5', 'wickets6': '1', 'Bowling_team_img': 'images/IND.png', 'maidens6': '0', 'Batting team': 'RSA', 'matchid2': '19166', 'name6': 'Unadkat', 'teams2': 'RSA vs IND', 'wickets10': '9', 'desc10': 'Inns', 'runs5': '32', 'matchtype2': 'T20', 'Scorecard1': '/Scorecard6', 'runs1': '2', 'wickets5': '0', 'runs6': '33', 'runs2': '0', 'maidens5': '0', 'runs20': '203', 'name5': 'Bumrah*', 'progress2': 'Result', 'Commentary1': '/Commentary6', 'fours2': '0', 'series2': 'India tour of South Africa, 2017-18', 'name1': 'Junior Dala*', 'commentaryid': 6, 'matchno2': '1st T20I', 'six1': '0', 'overs6': '4', 'Bowling team': 'IND', 'balls2': '2', 'balls1': '3', 'name2': 'Shamsi', 'overs20': '20', 'runs10': '175', 'desc20': 'Inns', 'status2': 'Ind won by 28 runs'}]

3 answers

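Since your records don't appear to have a unique identifier to tell them apart, you need to hash all of the key-value pairs. This approach works as long as the dictionaries contain no nested mutable objects.

I'll use OrderedDict here to preserve order.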
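To illustrate the caveat about nested mutable objects, here is a minimal sketch. The records and the helper name `is_hashable_record` are hypothetical and only serve to show when freezing a dictionary's items fails.

```python
# Hypothetical records: the second one nests a list, which is mutable
# and therefore unhashable.
flat_record = {'man': 'tim', 'age': '2'}
nested_record = {'man': 'jim', 'scores': [1, 2, 3]}

def is_hashable_record(record):
    """Return True if the record's items can be frozen into a hashable key."""
    try:
        # Building the frozenset hashes every (key, value) tuple, so an
        # unhashable value raises TypeError here.
        hash(frozenset(record.items()))
    except TypeError:
        return False
    return True

flat_ok = is_hashable_record(flat_record)
nested_ok = is_hashable_record(nested_record)
print(flat_ok, nested_ok)
```

Records like `nested_record` would need their nested values converted to something hashable (for example, lists to tuples) before this kind of deduplication can be applied.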

from collections import OrderedDict
list(
     map(
         dict, 
         OrderedDict.fromkeys(
             map(frozenset, map(dict.items, table)), None
         )
     )
)

[{'age': '2', 'h': '5', 'man': 'tim', 'w': '40'},
 {'age': '4', 'h': '3', 'man': 'jim', 'w': '20'},
 {'age': '24', 'h': '5', 'man': 'jon', 'w': '80'},
 {'age': '7', 'h': '4', 'man': 'tto', 'w': '49'}]

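Here's what happens:

  1. Convert each dictionary into a frozenset of (key, value) tuples. frozensets are hashable.
  2. Hash each frozenset as a key into an OrderedDict. Duplicates are removed automatically.
  3. Retrieve the keys and convert them back into a list of dictionaries.

There are many ways to reproduce the algorithm above. I used map, one of the functional-programming tools Python provides.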
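As one such alternative, the three steps can be written out with comprehensions so each intermediate value is visible. This is only a sketch on a shortened sample table; the names `frozen`, `unique_keys`, and `deduped` are illustrative and not part of the original answer.

```python
from collections import OrderedDict

# Small sample in the same shape as the question's `table`.
table = [
    {'man': 'tim', 'age': '2'},
    {'man': 'jim', 'age': '4'},
    {'man': 'tim', 'age': '2'},
]

# Step 1: freeze each dictionary's (key, value) pairs so it becomes hashable.
frozen = [frozenset(d.items()) for d in table]

# Step 2: use the frozensets as OrderedDict keys. Repeated keys collapse
# into a single entry, and first-seen order is kept.
unique_keys = OrderedDict((f, None) for f in frozen)

# Step 3: turn each remaining frozenset back into a dictionary.
deduped = [dict(f) for f in unique_keys]

print(deduped)
```

Because a frozenset is unordered, the key order inside each rebuilt dictionary may differ from the original, although the dictionaries compare equal.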

If you can hash the rows into a set, you can find and remove the duplicates. One way to do it:

Code:

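def remove_dupes(a_list):
    """Return a_list without duplicate dictionaries, keeping first-seen order."""
    already_have = set()  # hashable forms of the rows seen so far
    new_table = []
    for row in a_list:
        # Sort the items so that key order does not affect the comparison.
        row_hashable = tuple(sorted(row.items()))
        if row_hashable not in already_have:
            new_table.append(row)
            already_have.add(row_hashable)
    return new_table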

Test code:

table = [
    {'man': 'tim', 'age': '2', 'h': '5', 'w': '40'},
    {'man': 'jim', 'age': '4', 'h': '3', 'w': '20'},
    {'man': 'jon', 'age': '24', 'h': '5', 'w': '80'},
    {'man': 'tim', 'age': '2', 'h': '5', 'w': '40'},
    {'man': 'tto', 'age': '7', 'h': '4', 'w': '49'}
]

print(remove_dupes(table))

Result:

[    
    {'man': 'tim', 'age': '2', 'h': '5', 'w': '40'}, 
    {'man': 'jim', 'age': '4', 'h': '3', 'w': '20'}, 
    {'man': 'jon', 'age': '24', 'h': '5', 'w': '80'},
    {'man': 'tto', 'age': '7', 'h': '4', 'w': '49'}
]
list(map(dict, {tuple(sorted(t.items())):1 for t in table}.keys()))

Or, using a set:

list(map(dict, set(tuple(sorted(t.items())) for t in table)))

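As @cᴏʟᴅsᴘᴇᴇᴅ pointed out, the dictionary-based solution above does not preserve order on Python < 3.6, and the set-based one does not guarantee order on any version.

Here is a solution that preserves order: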

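seen = set()   # sorted item tuples already encountered
singlev = []   # unique dictionaries, in first-seen order
for t in table:
    key = tuple(sorted(t.items()))
    if key not in seen:  # compare tuples with tuples, not with the dicts in singlev
        seen.add(key)
        singlev.append(t)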
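On Python 3.7 and later, where dict is guaranteed to keep insertion order, the dictionary-based one-liner can also serve as an order-preserving variant. The sketch below assumes that version and uses dict.fromkeys in place of the dictionary comprehension; the name `deduped` is illustrative.

```python
table = [
    {'man': 'tim', 'age': '2', 'h': '5', 'w': '40'},
    {'man': 'jim', 'age': '4', 'h': '3', 'w': '20'},
    {'man': 'jon', 'age': '24', 'h': '5', 'w': '80'},
    {'man': 'tim', 'age': '2', 'h': '5', 'w': '40'},
    {'man': 'tto', 'age': '7', 'h': '4', 'w': '49'},
]

# dict.fromkeys keeps the first occurrence of each key in insertion order
# (guaranteed from Python 3.7), so the rows stay in first-seen order.
deduped = list(map(dict, dict.fromkeys(tuple(sorted(t.items())) for t in table)))

print([d['man'] for d in deduped])
```

The rebuilt dictionaries have their keys in sorted order, because each one is reconstructed from its sorted item tuple.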
