Python and nested JSON

Posted 2024-10-04 01:32:27


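I'm trying to convert nested JSON into columns of my current DataFrame. How can I do this? Note: the list of dictionaries under "Functions" in the JSON repeats more than 600 times.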

{
"Functions": [
    {
        "CodeSha256": "",
        "CodeSize": 
        "Description": "",
        "Environment": {
            "Variables": {

                "COMMIT_HASH": ",
                "CodeSha256": "",
                "ECS_LOG_STREAM": "",
                "ELASTIC_SEARCH_DOMAIN_ENDPOINT": "",
                "ENVIRONMENT": "prod",
                "SERVICE_NAME": "testingservicename",
                "SERVICE_OWNER": "testingserviceowner",

            }
        },
        
        "FunctionName": "demofunctionname",
        "Timeout": ,
        "TracingConfig": {
            "Mode": 
        },
        "Version": "",
        "VpcConfig": {
            "SecurityGroupIds": [
                ""
            ],
            "SubnetIds": [
                "",
                "",
                ""
            ],
            "VpcId": ""
        }
    }

] }

How I load the JSON

data = json.load(open('../fileservice.json'))
df = pd.DataFrame(data["Functions"])

How it currently looks

    FunctionName                                    Environment
0 demofunctionname      {Variables{"COMMIT_HASH":"djkdkd","SERVICE_OWNER":"serviceownertest"}}                 

How I need it to look

        FunctionName                               COMMIT_HASH       SERVICE_OWNER      
0      demofunctionname                            djkdkd             serviceownertest

I've been trying to work this out but haven't managed to get it done. Any suggestions or guidance would be greatly appreciated.


Tags: json, data, environment, service, hash, variables, functions, testing
2 Answers

Take a look at pd.json_normalize(). It's a very nice tool. In your case, with the parsed JSON held in a dict s:

pd.json_normalize(s["Functions"])

will give the following output (only the first row is shown, transposed):

CodeSha256                                                             
CodeSize                                                               
Description                                                            
FunctionName                                           demofunctionname
Timeout                                                                
Version                                                                
Environment.Variables.COMMIT_HASH                                  test
Environment.Variables.CodeSha256                                       
Environment.Variables.ECS_LOG_STREAM                                   
Environment.Variables.ELASTIC_SEARCH_DOMAIN_END...                     
Environment.Variables.ENVIRONMENT                                  prod
Environment.Variables.SERVICE_NAME                   testingservicename
Environment.Variables.SERVICE_OWNER                 testingserviceowner
TracingConfig.Mode                                                     
VpcConfig.SecurityGroupIds                                           []
VpcConfig.SubnetIds                                              [, , ]
VpcConfig.VpcId                                                        
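
To get from that flattened frame to the three columns asked for in the question, one option is to select the dotted column names and rename them. This is only a minimal sketch: the data dict is a trimmed-down stand-in for the real file, and the wanted mapping is a name introduced here for illustration.

```python
import pandas as pd

# Trimmed-down stand-in for the question's data (values are illustrative).
data = {
    "Functions": [
        {
            "FunctionName": "demofunctionname",
            "Environment": {
                "Variables": {
                    "COMMIT_HASH": "djkdkd",
                    "SERVICE_OWNER": "serviceownertest",
                    "ENVIRONMENT": "prod",
                }
            },
        }
    ]
}

# Nested dicts become columns named with dot-separated paths.
flat = pd.json_normalize(data["Functions"])

# Hypothetical mapping from flattened column names to the short names wanted.
wanted = {
    "FunctionName": "FunctionName",
    "Environment.Variables.COMMIT_HASH": "COMMIT_HASH",
    "Environment.Variables.SERVICE_OWNER": "SERVICE_OWNER",
}

# Keep only the mapped columns, then shorten their names.
df = flat[list(wanted)].rename(columns=wanted)
print(df)
```

Selecting by an explicit mapping keeps the column order predictable and leaves the other flattened columns available in flat if they are needed later.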

I don't think this is the best way, but it is a solution.

s={
    "Functions": [
    {
        "CodeSha256": "",
        "CodeSize": "",
        "Description": "",
        "Environment": {
            "Variables": {
                "COMMIT_HASH": "test",
                "CodeSha256": "",
                "ECS_LOG_STREAM": "",
                "ELASTIC_SEARCH_DOMAIN_ENDPOINT": "",
                "ENVIRONMENT": "prod",
                "SERVICE_NAME": "testingservicename",
                "SERVICE_OWNER": "testingserviceowner",

            }
        },
        
        "FunctionName": "demofunctionname",
        "Timeout": "" ,
        "TracingConfig": {
            "Mode": ""
        },
        "Version": "",
        "VpcConfig": {
            "SecurityGroupIds": [
                ""
            ],
            "SubnetIds": [
                "",
                "",
                ""
            ],
            "VpcId": ""
        }
    }
] }

import pandas as pd
import json

# Round-trip through a JSON string to mimic loading the data from a file.
s = json.dumps(s)
data = json.loads(s)

# Pick the wanted fields out of the first function entry.
first = data["Functions"][0]
variables = first["Environment"]["Variables"]
result = {
    'FunctionName': first["FunctionName"],
    'COMMIT_HASH': variables["COMMIT_HASH"],
    'SERVICE_OWNER': variables["SERVICE_OWNER"],
}

df = pd.DataFrame(data=[result])
print(df)

Note: I think your JSON has some problems, which I have fixed in my solution.
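
If it is unclear whether a file is valid JSON, the parser's error can point to the problem. The sketch below uses a made-up one-line fragment (the broken string is an assumption, modelled on the valueless "CodeSize" key in the sample as posted) rather than the real file.

```python
import json

# Illustrative fragment only: a key with no value, like "CodeSize" in the
# sample as posted.
broken = '{"Functions": [{"CodeSize": , "FunctionName": "demofunctionname"}]}'

try:
    json.loads(broken)
    error = None
except json.JSONDecodeError as exc:
    # lineno, colno and msg describe where and why parsing stopped.
    error = exc
    print(f"Invalid JSON at line {exc.lineno}, column {exc.colno}: {exc.msg}")
```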

Edited version

Below is the code for multiple "Functions" entries

s={
    "Functions": [
        {
            "CodeSha256": "",
            "CodeSize": "",
            "Description": "",
            "Environment": {
                "Variables": {
                    "COMMIT_HASH": "test",
                    "CodeSha256": "",
                    "ECS_LOG_STREAM": "",
                    "ELASTIC_SEARCH_DOMAIN_ENDPOINT": "",
                    "ENVIRONMENT": "prod",
                    "SERVICE_NAME": "testingservicename",
                    "SERVICE_OWNER": "testingserviceowner",
    
                }
            },
            
            "FunctionName": "demofunctionname",
            "Timeout": "" ,
            "TracingConfig": {
                "Mode": ""
            },
            "Version": "",
            "VpcConfig": {
                "SecurityGroupIds": [
                    ""
                ],
                "SubnetIds": [
                    "",
                    "",
                    ""
                ],
                "VpcId": ""
            }
        },
        {
            "CodeSha256": "",
            "CodeSize": "",
            "Description": "",
            "Environment": {
                "Variables": {
                    "COMMIT_HASH": "test",
                    "CodeSha256": "",
                    "ECS_LOG_STREAM": "",
                    "ELASTIC_SEARCH_DOMAIN_ENDPOINT": "",
                    "ENVIRONMENT": "prod",
                    "SERVICE_NAME": "testingservicename",
                    "SERVICE_OWNER": "testingserviceowner",
    
                }
            },
            
            "FunctionName": "demofunctionname 1",
            "Timeout": "" ,
            "TracingConfig": {
                "Mode": ""
            },
            "Version": "",
            "VpcConfig": {
                "SecurityGroupIds": [
                    ""
                ],
                "SubnetIds": [
                    "",
                    "",
                    ""
                ],
                "VpcId": ""
            }
        }
    ]
}

import pandas as pd
import json

# Round-trip through a JSON string to mimic loading the data from a file.
s = json.dumps(s)
data = json.loads(s)

function_name_list = []
commit_hash_list = []
service_owner_list = []

# Collect the wanted fields from every function entry.
for function in data["Functions"]:
    variables = function["Environment"]["Variables"]
    function_name_list.append(function["FunctionName"])
    commit_hash_list.append(variables["COMMIT_HASH"])
    service_owner_list.append(variables["SERVICE_OWNER"])

# One list per column; the dict keys become the column names.
result = {'FunctionName': function_name_list, 'COMMIT_HASH': commit_hash_list, 'SERVICE_OWNER': service_owner_list}

df = pd.DataFrame(result)
print(df)
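
With 600+ entries, some of them may not contain every nested key. That is an assumption, since the question does not say so. In that case the direct indexing above would raise a KeyError. A hedged variant using dict.get() fills the gaps with None instead; the data below is illustrative, not taken from the real file.

```python
import pandas as pd

# Illustrative data: the second entry has no "Environment" block (an assumption).
data = {
    "Functions": [
        {
            "FunctionName": "demofunctionname",
            "Environment": {
                "Variables": {
                    "COMMIT_HASH": "test",
                    "SERVICE_OWNER": "testingserviceowner",
                }
            },
        },
        {"FunctionName": "demofunctionname 1"},
    ]
}

rows = []
for function in data["Functions"]:
    # .get() with a default avoids a KeyError when a nesting level is absent.
    variables = function.get("Environment", {}).get("Variables", {})
    rows.append({
        "FunctionName": function.get("FunctionName"),
        "COMMIT_HASH": variables.get("COMMIT_HASH"),
        "SERVICE_OWNER": variables.get("SERVICE_OWNER"),
    })

# A list of dicts becomes one row per dict.
df = pd.DataFrame(rows)
print(df)
```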
