Python: Parse JSON to find rows that match on several keys, and return each matching set as a single JSON record

Posted 2024-10-01 17:34:20


I have JSON data like the following, one object per line:

{"processId":"p1","userId":"user1","reportName":"report1","threadId": "12234", "some_other_keys":"respective values"}
{"userId":"user1","processId":"p1","reportName":"report1","threadId":"12335", "some_other_keys":"respective values"}
{"reportName":"report2","processId":"p1","userId":"user1","threadId":"12434", "some_other_keys":"respective values"}
{"threadId":"12734", "some_other_keys":"respective values", "processId":"p1","userId":"user2","reportName":"report1"}
{"processId":"p1","reportName":"report1","threadId":"12534", "some_other_keys":"respective values","userId":"user2"}
{"processId":"p1","userId":"user1","reportName":"report2","threadId":"12934", "some_other_keys":"respective values"}
{"processId":"p1","userId":"user1","reportName":"report1","threadId":"12834", "some_other_keys":"respective values"}
{"processId":"p1","userId":"user1","reportName":"report2","threadId":"12634", "some_other_keys":"respective values"}

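Goal: write a function that finds every distinct set of rows sharing the same values for "processId", "userId", and "reportName", and returns each set as a single JSON record in which the remaining keys of each matching row are renamed as shown below.

In the example above there are three matching sets.

Set1 (for "processId":"p1", "userId":"user1", "reportName":"report1"):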

{"processId":"p1","userId":"user1","reportName":"report1","threadId":"12234", "some_other_keys":"respective values"}
{"userId":"user1","processId":"p1","reportName":"report1","threadId":"12335", "some_other_keys":"respective values"}
{"processId":"p1","userId":"user1","reportName":"report1","threadId":"12834", "some_other_keys":"respective values"}

Set2 (for "processId":"p1", "userId":"user1", "reportName":"report2"):

{"reportName":"report2","processId":"p1","userId":"user1","threadId":"12434", "some_other_keys":"respective values"}
{"processId":"p1","userId":"user1","reportName":"report2","threadId":"12934", "some_other_keys":"respective values"}
{"processId":"p1","userId":"user1","reportName":"report2","threadId":"12634", "some_other_keys":"respective values"}

Set3 (for "processId":"p1", "userId":"user2", "reportName":"report1"):

{"threadId":"12734", "some_other_keys":"respective values", "processId":"p1","userId":"user2","reportName":"report1"}
{"processId":"p1","reportName":"report1","threadId":"12534", "some_other_keys":"respective values","userId":"user2"}

So for this particular example the function should return three distinct records, as follows:

Set1 (for "processId":"p1", "userId":"user1", "reportName":"report1"): {"processId":"p1","userId":"user1","reportName":"report1","threadId_1":"12234", "some_other_keys_1":"respective values", "threadId_2":"12335", "some_other_keys_2":"respective values", "threadId_3":"12834", "some_other_keys_3":"respective values"}

Set2 (for "processId":"p1", "userId":"user1", "reportName":"report2"): {"processId":"p1","userId":"user1","reportName":"report2","threadId_1":"12934", "some_other_keys_1":"respective values","threadId_2":"12434", "some_other_keys_2":"respective values","threadId_3":"12634", "some_other_keys_3":"respective values"}

Set3 (for "processId":"p1", "userId":"user2", "reportName":"report1"): {"threadId_1":"12734", "some_other_keys_1":"respective values", "processId":"p1","userId":"user2","reportName":"report1", "threadId_2":"12534", "some_other_keys_2":"respective values"}

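So the function returns three records here (the number can be larger or smaller depending on how many matching sets there are).

I need a solution that is (a) efficient and (b) short, because I will be processing a large number of rows. I want the code to run fast and to use as few lines as possible.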


1 Answer
User
#1 · Posted 2024-10-01 17:34:20
import json

# Assumes data.json holds one JSON object per line, as in the question's sample.
# If the file is a single JSON array instead, use data = json.load(f).
with open('data.json') as f:
    data = [json.loads(line) for line in f if line.strip()]

# Maps (processId, userId, reportName) -> merged record for that group.
sets_of_process = {}
# Number of rows merged into each group so far; used as the key suffix.
row_counts = {}

for item in data:
    set_id = (item['processId'], item['userId'], item['reportName'])
    if set_id not in sets_of_process:
        # First row of a new group: start the record with the shared keys.
        sets_of_process[set_id] = {'processId': set_id[0], 'userId': set_id[1], 'reportName': set_id[2]}
        row_counts[set_id] = 0
    row_counts[set_id] += 1
    thread_number = row_counts[set_id]
    # Add this row's remaining fields under numbered key names.
    sets_of_process[set_id][f'threadId_{thread_number}'] = item['threadId']
    sets_of_process[set_id][f'some_other_keys_{thread_number}'] = item['some_other_keys']

# Print one merged JSON record per group, in order of first appearance.
for i, record in enumerate(sets_of_process.values(), start=1):
    print(f'Set {i}:')
    print(json.dumps(record))
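
As a possible generalization of the answer above, here is a minimal sketch that suffixes every non-grouping key instead of hard-coding threadId and some_other_keys. The function name merge_records, the GROUP_KEYS constant, and the inline sample list are illustrative assumptions, not part of the original answer. The input is assumed to be one JSON object per line, as in the question.

```python
import json
from collections import defaultdict

# Grouping keys named in the question; everything else gets a numeric suffix.
GROUP_KEYS = ("processId", "userId", "reportName")


def merge_records(lines, group_keys=GROUP_KEYS):
    """Merge rows that share the same group_keys values into one dict per group."""
    merged = {}                # group tuple -> merged record, in first-seen order
    counts = defaultdict(int)  # group tuple -> number of rows merged so far
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        row = json.loads(line)
        group = tuple(row[k] for k in group_keys)
        counts[group] += 1
        n = counts[group]
        # Create the record with the shared keys the first time a group is seen.
        record = merged.setdefault(group, dict(zip(group_keys, group)))
        for key, value in row.items():
            if key not in group_keys:
                record[f"{key}_{n}"] = value
    return list(merged.values())


# Hypothetical sample: the first four rows from the question.
sample = [
    '{"processId":"p1","userId":"user1","reportName":"report1","threadId": "12234", "some_other_keys":"respective values"}',
    '{"userId":"user1","processId":"p1","reportName":"report1","threadId":"12335", "some_other_keys":"respective values"}',
    '{"reportName":"report2","processId":"p1","userId":"user1","threadId":"12434", "some_other_keys":"respective values"}',
    '{"threadId":"12734", "some_other_keys":"respective values", "processId":"p1","userId":"user2","reportName":"report1"}',
]

for record in merge_records(sample):
    print(json.dumps(record))
```

Because the function accepts any iterable of lines, an open file object can be passed directly, so the whole file never has to be held in memory as a list. Each row is visited once and each group is found with a single dictionary lookup. Suffix numbers follow the order in which rows appear in the input.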
