How do I remove duplicate dictionaries from a list?

table = [ {'man':'tim','age':'2','h':'5','w':'40'}, {'man':'jim','age':'4','h':'3','w':'20'}, {'man':'jon','age':'24','h':'5','w':'80'}, {'man':'tim','age':'2','h':'5','w':'40'}, {'man':'tto','age':'7','h':'4','w':'49'} ]

[{'scorecardid': 1, 'progress2': 'preview', 'series2': 'Afghanistan v Zimbabwe in UAE, 2018', 'Commentary1': '/Commentary1', 'commentaryid': 1, 'matchid2': '10', 'matchno2': '5th ODI', 'teams2': 'AFG vs ZIM', 'matchtype2': 'ODI', 'Scorecard1': '/Scorecard1', 'status2': 'Starts on Feb 19 at 10:30 GMT'}, {'six2': '0', 'scorecardid': 2, 'overs5': '4', 'fours1': '0', 'overs10': '20', 'Batting_team_img': 'images/RSA.png', 'wickets20': '5', 'wickets6': '1', 'Bowling_team_img': 'images/IND.png', 'maidens6': '0', 'Batting team': 'RSA', 'matchid2': '9', 'name6': 'Unadkat', 'teams2': 'RSA vs IND', 'wickets10': '9', 'desc10': 'Inns', 'runs5': '32', 'matchtype2': 'T20', 'Scorecard1': '/Scorecard2', 'runs1': '2', 'wickets5': '0', 'runs6': '33', 'runs2': '0', 'maidens5': '0', 'runs20': '203', 'name5': 'Bumrah*', 'progress2': 'complete', 'Commentary1': '/Commentary2', 'fours2': '0', 'series2': 'India tour of South Africa, 2017-18', 'name1': 'Junior Dala*', 'commentaryid': 2, 'matchno2': '1st T20I', 'six1': '0', 'overs6': '4', 'Bowling team': 'IND', 'balls2': '2', 'balls1': '3', 'name2': 'Shamsi', 'overs20': '20', 'runs10': '175', 'desc20': 'Inns', 'status2': 'Ind won by 28 runs'}, {'scorecardid': 3, 'overs5': '0.4', 'fours1': '0', 'overs10': '18.4', 'Batting_team_img': 'images/BAN.png', 'wickets20': '4', 'wickets6': '1', 'Bowling_team_img': 'images/SL.png', 'Batting team': 'BAN', 'matchid2': '6', 'name6': 'Shanaka', 'teams2': 'BAN vs SL', 'wickets10': '10', 'desc10': 'Inns', 'runs5': '3', 'matchtype2': 'T20', 'Scorecard1': '/Scorecard3', 'runs1': '1', 'wickets5': '2', 'runs6': '5', 'maidens5': '0', 'runs20': '210', 'progress2': 'complete', 'Commentary1': '/Commentary3', 'name5': 'Gunathilaka*', 'series2': 'Sri Lanka tour of Bangladesh, 2018', 'name1': 'Nazmul Islam', 'commentaryid': 3, 'matchno2': '2nd T20I', 'six1': '0', 'overs6': '1.5', 'Bowling team': 'SL', 'maidens6': '0', 'balls1': '1', 'overs20': '20', 'runs10': '135', 'desc20': 'Inns', 'status2': 'SL won by 75 runs'}, 
{'six2': '2', 'scorecardid': 4, 'overs5': '4', 'fours1': '1', 'overs10': '20', 'Batting_team_img': 'images/NZ.png', 'wickets20': '7', 'wickets6': '1', 'Bowling_team_img': 'images/ENG.png', 'maidens6': '0', 'Batting team': 'NZ', 'matchid2': '4', 'name6': 'Tom Curran', 'teams2': 'NZ vs ENG', 'wickets10': '4', 'desc10': 'Inns', 'runs5': '41', 'matchtype2': 'T20', 'Scorecard1': '/Scorecard4', 'runs1': '7', 'wickets5': '0', 'runs6': '32', 'runs2': '37', 'maidens5': '0', 'runs20': '194', 'name5': 'Chris Jordan*', 'progress2': 'complete', 'Commentary1': '/Commentary4', 'fours2': '2', 'series2': 'England, Australia, New Zealand T20I Tri-Series, 2018', 'name1': 'de Grandhomme*', 'commentaryid': 4, 'matchno2': '6th Match', 'six1': '0', 'overs6': '3', 'Bowling team': 'ENG', 'balls2': '30', 'balls1': '5', 'name2': 'Chapman', 'overs20': '20', 'runs10': '192', 'desc20': 'Inns', 'status2': 'Eng won by 2 runs'}, {'scorecardid': 5, 'overs5': '7.4', 'fours1': '3', 'runs20': '213', 'six2': '0', 'commentaryid': 5, 'Batting team': 'SAUS', 'matchid2': '18770', 'matchno2': '21st Match', 'wickets10': '3', 'overs10': '49.4', 'matchtype2': 'TEST', 'runs1': '26', 'overs6': '8', 'runs6': '39', 'runs2': '49', 'name1': 'Mennie*', 'name5': 'Daniel Fallins*', 'series2': 'Sheffield Shield, 2017-18', 'Commentary1': '/Commentary5', 'wickets6': '1', 'runs11': '281', 'six1': '0', 'runs10': '192', 'balls1': '58', 'overs11': '74.1', 'maidens5': '1', 'desc21': '1st Inns', 'status2': 'South Aus won by 7 wkts', 'runs5': '51', 'wickets11': '10', 'desc11': '1st Inns', 'desc20': '2nd Inns', 'wickets20': '10', 'wickets21': '10', 'teams2': 'NSW vs SAUS', 'balls2': '85', 'Scorecard1': '/Scorecard5', 'wickets5': '1', 'progress2': 'Result', 'runs21': '256', 'fours2': '6', 'desc10': '2nd Inns', 'name6': 'Stobo', 'maidens6': '1', 'Bowling team': 'NSW', 'name2': 'Ferguson', 'overs20': '68.4', 'overs21': '90.4'}, {'six2': '0', 'scorecardid': 6, 'overs5': '4', 'fours1': '0', 'overs10': '20', 'Batting_team_img': 
'images/RSA.png', 'wickets20': '5', 'wickets6': '1', 'Bowling_team_img': 'images/IND.png', 'maidens6': '0', 'Batting team': 'RSA', 'matchid2': '19166', 'name6': 'Unadkat', 'teams2': 'RSA vs IND', 'wickets10': '9', 'desc10': 'Inns', 'runs5': '32', 'matchtype2': 'T20', 'Scorecard1': '/Scorecard6', 'runs1': '2', 'wickets5': '0', 'runs6': '33', 'runs2': '0', 'maidens5': '0', 'runs20': '203', 'name5': 'Bumrah*', 'progress2': 'Result', 'Commentary1': '/Commentary6', 'fours2': '0', 'series2': 'India tour of South Africa, 2017-18', 'name1': 'Junior Dala*', 'commentaryid': 6, 'matchno2': '1st T20I', 'six1': '0', 'overs6': '4', 'Bowling team': 'IND', 'balls2': '2', 'balls1': '3', 'name2': 'Shamsi', 'overs20': '20', 'runs10': '175', 'desc20': 'Inns', 'status2': 'Ind won by 28 runs'}]

3 Answers

User

Answer 1 · edited 2024-06-26 17:48:59

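Since your records do not seem to have a unique identifier that distinguishes them, you need to hash all of the key-value pairs. This approach works as long as the dictionaries contain no nested mutable objects.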
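To illustrate the caveat about nested mutable objects, here is a minimal sketch. The `rows` data and the `freeze` helper are hypothetical and not part of the original answer. Serializing with `json.dumps(..., sort_keys=True)` is only one possible workaround, and it assumes every value is JSON-serializable.

```python
import json

# Hypothetical rows with a nested list value (not taken from the question).
rows = [
    {'man': 'tim', 'tags': ['a', 'b']},
    {'man': 'tim', 'tags': ['a', 'b']},
    {'man': 'jim', 'tags': []},
]

# Hashing the items fails here because a list is unhashable.
try:
    frozenset(rows[0].items())
    hashing_failed = False
except TypeError:
    hashing_failed = True


def freeze(row):
    """Hypothetical helper: build a hashable key from a JSON-serializable dict."""
    return json.dumps(row, sort_keys=True)


seen = set()
unique_rows = []
for row in rows:
    key = freeze(row)
    if key not in seen:
        seen.add(key)
        unique_rows.append(row)

print(hashing_failed)
print(unique_rows)
```

The string key is only used for comparison; the original dictionaries are what end up in `unique_rows`.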

I will use an OrderedDict here to preserve the original order.

from collections import OrderedDict
list(
     map(
         dict, 
         OrderedDict.fromkeys(
             map(frozenset, map(dict.items, table)), None
         )
     )
)

[{'age': '2', 'h': '5', 'man': 'tim', 'w': '40'},
 {'age': '4', 'h': '3', 'man': 'jim', 'w': '20'},
 {'age': '24', 'h': '5', 'man': 'jon', 'w': '80'},
 {'age': '7', 'h': '4', 'man': 'tto', 'w': '49'}]

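Here is what happens:

Each dictionary is converted into a frozenset of (key, value) tuples, which makes it hashable.
Each frozenset is used as a key in the OrderedDict. Duplicates are dropped automatically.
The keys are retrieved and converted back into a list of dictionaries.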
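The three steps can also be written out one at a time, which may be easier to follow than the nested `map` calls. This is a sketch on a shortened copy of `table`; the intermediate names `frozen_rows`, `unique_keys`, and `deduped` are introduced here for illustration only.

```python
from collections import OrderedDict

# Shortened copy of the question's table, with one duplicate row.
table = [
    {'man': 'tim', 'age': '2', 'h': '5', 'w': '40'},
    {'man': 'jim', 'age': '4', 'h': '3', 'w': '20'},
    {'man': 'tim', 'age': '2', 'h': '5', 'w': '40'},
]

# Step 1: each dict becomes a frozenset of (key, value) tuples, which is hashable.
frozen_rows = [frozenset(row.items()) for row in table]

# Step 2: use the frozensets as OrderedDict keys; a repeated key is stored once,
# at the position of its first occurrence.
unique_keys = OrderedDict.fromkeys(frozen_rows)

# Step 3: turn each key back into a dict.
deduped = [dict(key) for key in unique_keys]

print(deduped)
```

Because a frozenset is unordered, the key order inside each rebuilt dictionary may differ from the input, although the dictionaries still compare equal to the originals.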

There are many ways to implement this algorithm. I used map, one of the functional programming tools that Python provides.
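As one example of another way to write it, the sketch below wraps the same idea in a function and uses a plain `dict` instead of `OrderedDict`. It assumes Python 3.7 or later, where `dict` is guaranteed to keep insertion order. The name `dedupe` is hypothetical.

```python
table = [
    {'man': 'tim', 'age': '2', 'h': '5', 'w': '40'},
    {'man': 'jim', 'age': '4', 'h': '3', 'w': '20'},
    {'man': 'jon', 'age': '24', 'h': '5', 'w': '80'},
    {'man': 'tim', 'age': '2', 'h': '5', 'w': '40'},
    {'man': 'tto', 'age': '7', 'h': '4', 'w': '49'},
]


def dedupe(rows):
    """Hypothetical helper: drop repeated dicts, keeping first occurrences.

    Assumes Python 3.7+ (ordered dict) and hashable values in every row.
    """
    keys = dict.fromkeys(frozenset(row.items()) for row in rows)
    return [dict(key) for key in keys]


print(dedupe(table))
```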

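User

Answer 2 · edited 2024-06-26 17:48:59

If each row can be turned into something hashable and stored in a set, duplicates can be found and removed. One way to do that: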

Code:

def remove_dupes(a_list):
    already_have = set()
    new_table = []
    for row in a_list:
        row_hashable = tuple(sorted(row.items()))
        if row_hashable not in already_have:
            new_table.append(row)
            already_have.add(row_hashable)
    return new_table

Test code:

table = [
    {'man': 'tim', 'age': '2', 'h': '5', 'w': '40'},
    {'man': 'jim', 'age': '4', 'h': '3', 'w': '20'},
    {'man': 'jon', 'age': '24', 'h': '5', 'w': '80'},
    {'man': 'tim', 'age': '2', 'h': '5', 'w': '40'},
    {'man': 'tto', 'age': '7', 'h': '4', 'w': '49'}
]

print(remove_dupes(table))

Result:

[    
    {'man': 'tim', 'age': '2', 'h': '5', 'w': '40'}, 
    {'man': 'jim', 'age': '4', 'h': '3', 'w': '20'}, 
    {'man': 'jon', 'age': '24', 'h': '5', 'w': '80'},
    {'man': 'tto', 'age': '7', 'h': '4', 'w': '49'}
]
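In the question's second list, some records appear to differ only in a handful of identifier-like fields (for example `scorecardid` and `matchid2`), so an exact comparison would keep both. If such rows should count as duplicates, a variant of the function above could skip chosen keys when building the comparison key. This is a sketch under that assumption: the function name, the `ignore_keys` parameter, and the trimmed-down `matches` data are all hypothetical.

```python
def remove_dupes_ignoring(rows, ignore_keys=()):
    """Keep the first row of each group that matches on all keys except ignore_keys."""
    ignored = set(ignore_keys)
    already_have = set()
    result = []
    for row in rows:
        # Build the comparison key from every item whose key is not ignored.
        key = tuple(sorted((k, v) for k, v in row.items() if k not in ignored))
        if key not in already_have:
            already_have.add(key)
            result.append(row)
    return result


# Hypothetical, trimmed-down records in the spirit of the scorecard data.
matches = [
    {'scorecardid': 2, 'teams2': 'RSA vs IND', 'status2': 'Ind won by 28 runs'},
    {'scorecardid': 3, 'teams2': 'BAN vs SL', 'status2': 'SL won by 75 runs'},
    {'scorecardid': 6, 'teams2': 'RSA vs IND', 'status2': 'Ind won by 28 runs'},
]

unique_matches = remove_dupes_ignoring(matches, ignore_keys=['scorecardid'])
print(unique_matches)
```

Which keys to ignore depends on the data; with an empty `ignore_keys` the function behaves like an exact comparison of all items.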

User

Answer 3 · edited 2024-06-26 17:48:59

list(map(dict, {tuple(sorted(t.items())):1 for t in table}.keys()))

Or, using a set:

list(map(dict, set(tuple(sorted(t.items())) for t in table)))

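As @cᴏʟᴅsᴘᴇᴇᴅ pointed out, the solutions above do not preserve order in Python < 3.6. The set-based version does not guarantee order in any Python version.

Here is a solution that preserves order: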

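singlev = []
seen = set()  # sorted item tuples that have already been encountered
for k, v in enumerate([tuple(sorted(t.items())) for t in table]):
    # Compare the tuple against earlier tuples; singlev holds dicts, not tuples.
    if v not in seen:
        seen.add(v)
        singlev.append(table[k])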
