Python: Parsing JSON to find rows that match on several keys

Posted 2024-10-01 17:28:15


I have JSON that looks like this:

{"processId":"p1","userId":"user1","reportName":"report1","threadId":"12234", "some_other_keys":"respective values"}
{"userId":"user1","processId":"p1","reportName":"report1","threadId":"12335", "some_other_keys":"respective values"}
{"reportName":"report2","processId":"p1","userId":"user1","threadId":"12434", "some_other_keys":"respective values"}
{"threadId":"12734", "some_other_keys":"respective values", "processId":"p1","userId":"user2","reportName":"report1"}
{"processId":"p1","reportName":"report1","threadId":"12534", "some_other_keys":"respective values","userId":"user2"}
{"processId":"p1","userId":"user1","reportName":"report2","threadId":"12934", "some_other_keys":"respective values"}
{"processId":"p1","userId":"user1","reportName":"report1","threadId":"12834", "some_other_keys":"respective values"}
{"processId":"p1","userId":"user1","reportName":"report2","threadId":"12634", "some_other_keys":"respective values"}

Goal: write a function that returns all distinct sets of rows that share the same values for "processId", "userId", and "reportName".

So, for this particular example, the function should return three distinct sets:

Set1 (for "processId":"p1", "userId":"user1", "reportName":"report1"):

{"processId":"p1","userId":"user1","reportName":"report1","threadId":"12234", "some_other_keys":"respective values"}
{"userId":"user1","processId":"p1","reportName":"report1","threadId":"12335", "some_other_keys":"respective values"}
{"processId":"p1","userId":"user1","reportName":"report1","threadId":"12834", "some_other_keys":"respective values"}

Set2 (for "processId":"p1", "userId":"user1", "reportName":"report2"):

{"reportName":"report2","processId":"p1","userId":"user1","threadId":"12434", "some_other_keys":"respective values"}
{"processId":"p1","userId":"user1","reportName":"report2","threadId":"12934", "some_other_keys":"respective values"}
{"processId":"p1","userId":"user1","reportName":"report2","threadId":"12634", "some_other_keys":"respective values"}

Set3 (for "processId":"p1", "userId":"user2", "reportName":"report1"):

{"threadId":"12734", "some_other_keys":"respective values", "processId":"p1","userId":"user2","reportName":"report1"}
{"processId":"p1","reportName":"report1","threadId":"12534", "some_other_keys":"respective values","userId":"user2"}

So a single function returns three sets here (there may be more or fewer, depending on how many distinct matching sets exist).

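I need a solution to the above that is (a) performant and (b) short, because I will be processing a large number of rows. I want the code to run fast and to take as few lines as possible.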

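I already have a solution that uses multiple if conditions and for loops (I am using Python's json module to parse the JSON and get the elements), but I would like something more efficient.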
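For reference, the sample above has one JSON object per line, so it can be loaded into a list of dicts with the standard json module. This is only a minimal sketch: the function name parse_rows and the shortened inline sample are assumptions made for illustration, not part of the original code.

```python
import json

def parse_rows(text):
    """Parse newline-delimited JSON into a list of dicts, skipping blank lines."""
    return [json.loads(line) for line in text.splitlines() if line.strip()]

# Shortened, hypothetical sample in the same shape as the rows above
sample = '''{"processId":"p1","userId":"user1","reportName":"report1","threadId":"12234"}
{"userId":"user2","processId":"p1","reportName":"report1","threadId":"12734"}'''

rows = parse_rows(sample)
print(len(rows))
print(rows[1]["userId"])
```

Key order within each line does not matter once the rows are dicts, which is why the differently ordered rows in the sample can still be compared by key.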


1 answer
User
#1 · Posted 2024-10-01 17:28:15

IIUC, use itertools.groupby together with operator.itemgetter:

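from operator import itemgetter
from itertools import groupby

# d is assumed to be the list of dicts parsed from the JSON above (one dict per row)
keys = ["processId", "userId", "reportName"]

# itemgetter(*keys) returns the tuple of the three values for a row
f = itemgetter(*keys)
srt = sorted(d, key=f)  # groupby only groups adjacent rows, so sort first
for k, g in groupby(srt, key=f):
    print(k)
    print(list(g))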

Output:

('p1', 'user1', 'report1')
[{'some_other_keys': 'respective values', 'threadId': '12234', 'userId': 'user1', 'processId': 'p1', 'reportName': 'report1'},
 {'some_other_keys': 'respective values', 'threadId': '12335', 'userId': 'user1', 'processId': 'p1', 'reportName': 'report1'}, 
 {'some_other_keys': 'respective values', 'threadId': '12834', 'userId': 'user1', 'processId': 'p1', 'reportName': 'report1'}]
('p1', 'user1', 'report2')
[{'some_other_keys': 'respective values', 'threadId': '12434', 'userId': 'user1', 'processId': 'p1', 'reportName': 'report2'}, 
 {'some_other_keys': 'respective values', 'threadId': '12934', 'userId': 'user1', 'processId': 'p1', 'reportName': 'report2'}, 
 {'some_other_keys': 'respective values', 'threadId': '12634', 'userId': 'user1', 'processId': 'p1', 'reportName': 'report2'}]
('p1', 'user2', 'report1')
[{'threadId': '12734', 'userId': 'user2', 'reportName': 'report1', 'processId': 'p1', 'some_other_keys': 'respective values'}, 
 {'some_other_keys': 'respective values', 'threadId': '12534', 'userId': 'user2', 'processId': 'p1', 'reportName': 'report1'}]
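Since the question asks for a function and mentions a large number of rows, note that the sort makes the groupby approach O(n log n). A single pass with a dict keyed on the three values avoids the sort. This is a different technique from the one above, sketched under the assumption that every row contains all three keys; the name group_rows and the shortened sample rows are hypothetical.

```python
from collections import defaultdict
from operator import itemgetter

def group_rows(rows, keys=("processId", "userId", "reportName")):
    """Group dicts by the tuple of values under `keys` in one pass (no sorting)."""
    get_key = itemgetter(*keys)  # returns a tuple when given two or more keys
    groups = defaultdict(list)
    for row in rows:
        # Raises KeyError if a row lacks one of the keys (assumed not to happen)
        groups[get_key(row)].append(row)
    return dict(groups)

# Shortened, hypothetical rows in the same shape as the question's data
rows = [
    {"processId": "p1", "userId": "user1", "reportName": "report1", "threadId": "12234"},
    {"userId": "user1", "processId": "p1", "reportName": "report1", "threadId": "12335"},
    {"reportName": "report2", "processId": "p1", "userId": "user1", "threadId": "12434"},
    {"threadId": "12734", "processId": "p1", "userId": "user2", "reportName": "report1"},
]

groups = group_rows(rows)
for key, members in groups.items():
    print(key, [r["threadId"] for r in members])
```

Rows inside each group keep their input order, and the groups appear in order of first occurrence rather than sorted order, so sort the resulting keys if a fixed ordering is needed.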
