How to parse multiple dictionary values that share the same key

Posted 2024-09-28 20:41:31


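I have many lines of data (which I cannot modify by hand), each represented as key/value pairs in a dictionary. The problem is that one dictionary key can appear several times (an undefined number of times: twice, three times, 10 times, and so on), each time with a different value.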

I need to extract all of these values.

Here is a simple record in which the key Key-Word has two values:

{"Date": "Fri, 19 Apr 2019 00:54:46 GMT", "Vary": "Host,Accept-Encoding", "Key-Word": "00a", "Cache-Control": "private", "Key-Word": "xn"}

I wrote this Python script to extract a record's values.

import ast
import re
import json


inFile = open("sample.txt","r",errors="replace") 


cP=0 # key found flag
cV=0 # hold the key's value


try:
    myDict = {"Date": "Fri, 19 Apr 2019 00:54:46 GMT", "Vary": "Host,Accept-Encoding", "Key-Word": "00a", "Cache-Control": "private", "Key-Word": "xn"}
    smallmyDict= {}

except (ValueError, SyntaxError) as E:
    cV="error"
except Exception as E:
    cV="error"

# convert the header keys to lowercase
for key, value in myDict.items():
    smallmyDict[key.lower()] = value

# store all keys
smallmyDictKeys =smallmyDict.keys()



# search for a specific key
if 'key-word' in smallmyDictKeys: 
    cP=1
    cV = smallmyDict['key-word']
    print("Found!")
    print(cV) #print the key's value
else:
    print("NOT Found!")

The output I get is:

Found!
xn

The problem is that it only prints the value of the last occurrence of the key.

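If the key I am looking for appears several times, how can I make the code iterate over that key and print each value separately, instead of having it overwritten by the last value?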


3 answers

Because the keys are duplicated, the data cannot be loaded directly with json without losing values. Try the following instead:

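from collections import defaultdict

string = '{"Date": "Fri, 19 Apr 2019 00:54:46 GMT", "Vary": "Host,Accept-Encoding", "Key-Word": "00a", "Cache-Control": "private", "Key-Word": "xn"}'

# Collect every value seen for a key, so duplicates are kept rather than overwritten
data = defaultdict(list)

# Each piece looks like '"key": "value' (the first also carries '{', the last '"}')
pieces = string.split('",')

for each_piece in pieces:
    key, value = each_piece.split(':', maxsplit=1)
    # Strip the surrounding braces, spaces and quotes from both sides
    actual_key = key.strip('{ "')
    actual_value = value.strip(' "}')
    data[actual_key].append(actual_value)

print(data)

Output

defaultdict(<class 'list'>, {'Date': ['Fri, 19 Apr 2019 00:54:46 GMT'], 'Vary': ['Host,Accept-Encoding'], 'Key-Word': ['00a', 'xn'], 'Cache-Control': ['private']})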
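
Splitting on '",' can misfire if a value itself contains that character sequence. As a possible alternative, and assuming each record is a valid Python dict literal made of plain constants, the standard ast module can parse the literal without evaluating it, so duplicate keys stay visible. The helper name values_for_key is made up for this sketch and is not part of the original answer.

```python
import ast

def values_for_key(record, wanted):
    """Return every value stored under `wanted` (case-insensitive) in a dict literal."""
    node = ast.parse(record, mode="eval").body
    if not isinstance(node, ast.Dict):
        raise ValueError("record is not a dict literal")
    found = []
    # ast.Dict keeps keys and values as parallel lists, duplicates included
    for key_node, value_node in zip(node.keys, node.values):
        if key_node is None:  # a None key marks '**other' unpacking; skip it
            continue
        key = ast.literal_eval(key_node)
        if isinstance(key, str) and key.lower() == wanted.lower():
            found.append(ast.literal_eval(value_node))
    return found

record = '{"Date": "Fri, 19 Apr 2019 00:54:46 GMT", "Vary": "Host,Accept-Encoding", "Key-Word": "00a", "Cache-Control": "private", "Key-Word": "xn"}'

for value in values_for_key(record, "key-word"):
    print(value)
```

Because the literal is only parsed and never evaluated into a dict, nothing gets overwritten, and the lookup can be made case-insensitive in the same pass.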

A dictionary cannot hold two keys with the same name. One overwrites the other. At runtime only one pair exists for that key (the last entry).

https://www.python-course.eu/dictionaries.php is a good resource for reading up on dictionaries.
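
A small sketch of that overwriting behavior, using a shortened version of the record from the question (the variable name record is only illustrative):

```python
# A dict literal with a repeated key is legal Python, but only one entry survives
record = {"Key-Word": "00a", "Cache-Control": "private", "Key-Word": "xn"}

print(len(record))         # number of distinct keys
print(record["Key-Word"])  # the value from the last occurrence
```

By the time the dictionary exists, the first value is already gone, which is why the duplicates have to be captured while the text is being parsed rather than afterwards.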

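You can use json to parse the data and pass the object_pairs_hook parameter of json.loads to customize how the pairs are handled. In the example below, I group the different values of the same key into a list (and, as requested in your comment, concatenate them into a single string):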

import json
from collections import Counter, defaultdict

data = """{"Date": "Fri, 19 Apr 2019 00:54:46 GMT", "Vary": "Host,Accept-Encoding", "Key-Word": "00a", "Cache-Control": "private", "Key-Word": "xn"}

"""

def duplicate_keys(pairs):
    out = {}
    dups = defaultdict(list)
    key_count = Counter(key for key, value in pairs)

    for key, value in pairs:
        if key_count[key] == 1:
            out[key] = value
        else:
            dups[key].append(value)

    # Concatenate the lists of values in a string, enclosed in {} and separated by ';'
    # rather than in a list:       
    dups = {key: ';'.join('{' + v + '}' for v in values) for key, values in dups.items()}

    out.update(dups)
    return out

decoded = json.loads(data, object_pairs_hook=duplicate_keys)
print(decoded)

# {'Date': 'Fri, 19 Apr 2019 00:54:46 GMT', 
#  'Vary': 'Host,Accept-Encoding', 
#  'Cache-Control': 'private', 
#  'Key-Word': '{00a};{xn}'}
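
If the goal is to print each value separately, as the question asks, a variant of the same idea could keep the lists instead of joining them. This is only a sketch: the hook name collect_values is hypothetical, and lower-casing the keys is an assumption borrowed from the question's own script.

```python
import json
from collections import defaultdict

def collect_values(pairs):
    """Map each lower-cased key to the list of all values seen for it."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key.lower()].append(value)
    return dict(grouped)

raw = '{"Date": "Fri, 19 Apr 2019 00:54:46 GMT", "Vary": "Host,Accept-Encoding", "Key-Word": "00a", "Cache-Control": "private", "Key-Word": "xn"}'

# json.loads hands the hook the (key, value) pairs in document order, duplicates included
headers = json.loads(raw, object_pairs_hook=collect_values)

for value in headers.get("key-word", []):
    print(value)
```

Every key maps to a list here, even when it occurs once, so the calling code can loop over the values without checking the type first.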
