Extracting a specific JSON object from a log file in Python

Posted 2024-09-29 17:22:40


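I am trying to extract a specific JSON object from a log file that contains several JSON objects mixed with plain text. In this case I want the JSON that follows the text "Output payload". I have tried several approaches but could not extract the JSON I need. The file looks like this: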

[2020-05-17 15:32:11.698000] INFO [worker-1] org.mule.api.processor.LoggerMessageProcessor [[cloudhub-us-claim-services-1-0-0-prod].post:/claims/{claimNumber}/predictionScores:experience-claims-predictionscore-api.config.7.771]: PredictionScoreAPILogger-7c506940-987d-11ea-9ef4-0a5226a8e24f:16634746: Initialization: Request successfully logged to mirror queue
[2020-05-17 15:32:12.190000] INFO [worker-1] org.mule.transformer.simple.MessagePropertiesTransformer [[cloudhub-us-claim-services-1-0-0-prod].experience-claims-predictionscore-api.prediction-details-claim-updates.stage1.839]: Property with key 'response', not found on message using 'null'. Since the value was marked optional, nothing was set on the message for this property
[2020-05-17 15:32:12.192000] DEBUG [worker-1] aiml.logging.debug [[cloudhub-us-claim-services-1-0-0-prod].experience-claims-predictionscore-api.prediction-details-claim-updates.stage1.839]: PredictionScoreAPILogger-7c506940-987d-11ea-9ef4-0a5226a8e24f:16634746:Datarobot API Call: Output payload received from Datarobot API: {
  "prediction": "N",
  "predictionScore": 0.0000629713,
  "predictionExplanations": "lineItem : 0|feature: ADJER_CANNOT_COMPUTE_TWG_SUGGESTED_TIME_ZERO|Value: Y|strength: -1.4469371757,\nlineItem : 1|feature: ADJER_CANNOT_COMPUTE_TWG_SUGGESTED_PRICE|Value: Y|strength: -1.1968554807,\nlineItem : 2|feature: MONTHS_DIFF_CLAIM_REPAIR_FACILITY_FIRST_CLAIM|Value: 61|strength: -1.0681064444"
}

1 answer
User
#1 · Posted 2024-09-29 17:22:40

You can read the file as text and then parse it with a regular expression. Something like this:

import re

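# logfilepath is the path to your log file
with open(logfilepath, 'r') as logfile:
    log = logfile.read()
# Group 1 is the "Output payload ...:" prefix, group 2 is the JSON text up to the first "}"
objects = re.findall(r"(Output payload.*:\s?)(\{\s?[\s\S]+?\s?\})", log)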

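I tested the regex against the sample you gave and it works, so this code should work too. Because the pattern has two groups, each match is a (prefix, JSON text) tuple. Once you have all the JSON objects, it is easy to find the one you are looking for.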
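To turn the matched text into Python objects, each second group can be passed to json.loads. The following is a minimal sketch, not part of the original answer: SAMPLE_LOG is a shortened, hypothetical version of the log in the question, and extract_payloads is a helper name made up for illustration.

```python
import json
import re

# Hypothetical, shortened version of the log from the question.
SAMPLE_LOG = '''[2020-05-17 15:32:11.698000] INFO [worker-1] Initialization: Request successfully logged to mirror queue
[2020-05-17 15:32:12.192000] DEBUG [worker-1] aiml.logging.debug: Datarobot API Call: Output payload received from Datarobot API: {
  "prediction": "N",
  "predictionScore": 0.0000629713
}
'''

# Same pattern as in the answer, compiled once as a raw string.
PAYLOAD_RE = re.compile(r"(Output payload.*:\s?)(\{\s?[\s\S]+?\s?\})")


def extract_payloads(log_text):
    """Return the parsed JSON objects that follow "Output payload"."""
    payloads = []
    for _prefix, raw_json in PAYLOAD_RE.findall(log_text):
        try:
            payloads.append(json.loads(raw_json))
        except json.JSONDecodeError:
            # The lazy match ends at the first "}", so a payload with
            # nested objects is cut short and fails to parse; skip it.
            continue
    return payloads


payloads = extract_payloads(SAMPLE_LOG)
for payload in payloads:
    print(payload["prediction"], payload["predictionScore"])
```

Since the pattern stops at the first closing brace, this approach suits flat objects like the one in the sample.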

Happy hacking :)

Edit: updated the regex to match the revised question. It now looks for the string "Output payload".
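If the payloads may contain nested objects, or braces inside strings, an alternative to the regex is to find the "Output payload" marker and let the standard library's json.JSONDecoder.raw_decode read one complete JSON value starting at the next "{". This is a sketch of a different technique, not the method above; the function name, MARKER constant, and demo string are assumptions made for illustration.

```python
import json

MARKER = "Output payload"  # text that identifies the log entry of interest


def extract_payloads_raw_decode(log_text, marker=MARKER):
    """Parse the JSON value that starts at the first "{" after each marker."""
    decoder = json.JSONDecoder()
    results = []
    pos = log_text.find(marker)
    while pos != -1:
        brace = log_text.find("{", pos)
        if brace == -1:
            break  # no more braces, so no more payloads
        try:
            # raw_decode returns the object and the index where it ended.
            obj, end = decoder.raw_decode(log_text, brace)
            results.append(obj)
        except json.JSONDecodeError:
            end = brace + 1  # not valid JSON here; keep scanning
        pos = log_text.find(marker, end)
    return results


# Hypothetical one-line log entry with a nested object.
demo = (
    'DEBUG Output payload received from Datarobot API: '
    '{"prediction": "N", "details": {"score": 0.5}}\n'
    '[next] INFO some other line {not json}\n'
)
print(extract_payloads_raw_decode(demo))
```

The decoder handles nesting and escaped characters itself, so no assumption about where the closing brace sits is needed.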
