How do I convert a DataFrame to multi-level JSON with headers?

Posted 2024-10-01 02:21:29

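I have a pandas DataFrame that I want to convert to JSON for use by my source system, which requires a very specific JSON format.

I can't seem to get the exact format shown in the expected output section using simple dictionary loops.

Is there a way to convert a CSV/pd.DataFrame to nested JSON? Is there a Python package built specifically for this?

Input DataFrame: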

# Create the input DataFrame
import pandas as pd

data = {
        'col6':['A','A','A','B','B','B'],
        'col7':[1,  1,  2,  1,  2,  2],
        'col8':['A','A','A','B','B','B'],
        'col10':['A','A','A','B','B','B'],
        'col14':[1,1,1,1,1,2],
        'col15':[1,2,1,1,1,1],
        'col16':[9,10,26,9,12,4],
        'col18':[1,1,2,1,2,3],
        'col1':['xxxx','xxxx','xxxx','xxxx','xxxx','xxxx'],
        'col2':[2.02011E+13,2.02011E+13,2.02011E+13,2.02011E+13,2.02011E+13,2.02011E+13],
        'col3':['xxxx20201107023012','xxxx20201107023012','xxxx20201107023012','xxxx20201107023012','xxxx20201107023012','xxxx20201107023012'],
        'col4':['yyyy','yyyy','yyyy','yyyy','yyyy','yyyy'],
        'col5':[0,0,0,0,0,0],
        'col9':['A','A','A','B','B','B'],
        'col11':[0,0,0,0,0,0],
        'col12':[0,0,0,0,0,0],
        'col13':[0,0,0,0,0,0],
        'col17':[51,63,47,59,53,56]
        }
# Assign the frame to a name so it can be referenced later
df = pd.DataFrame(data)

Expected output:

{
    "header1": {
                "col1": "xxxx",
                "col2": "20201107023012",
                "col3": "xxxx20201107023012",
                "col4": "yyyy",
                "col5": "0" 
                        },
    
    "header2": 
    
    {
    "header3": 
            [
                {
                    col6: A,
                    col7: 1,
                    header4: 
                    [
                                         {
                                            col8: "A", 
                                            col9: 1, 
                                            col10: "A",
                                            col11: 0,
                                            col12: 0,
                                            col13: 0, 
                                            
                                            "header5": 
                                                [
                                                        {
                                                            col14: "1", 
                                                            col15: 1,  
                                                            col16: 1, 
                                                            col17: 51,
                                                            col18: 1 
                                                        },
                                                        
                                                        {
                                                            col14: "1", 
                                                            col15: 1,  
                                                            col16: 2, 
                                                            col17: 63,
                                                            col18: 2
                                                        }
                                                ]
                                        },
                                        {
                                            col8: "A", 
                                            col9: 1, 
                                            col10: "A",
                                            col11: 0,
                                            col12: 0,
                                            col13: 0, 
                                            
                                            "header5": 
                                                [
                                                        {
                                                            col14: "1", 
                                                            col15: 1,  
                                                            col16: 1, 
                                                            col17: 51,
                                                            col18: 1 
                                                        },
                                                        
                                                        {
                                                            col14: "1", 
                                                            col15: 1,  
                                                            col16: 2, 
                                                            col17: 63,
                                                            col18: 2
                                                        }
                                                ]
                                        }
                    ]
                }
            ]
    }
}

1 answer
User
#1 · Posted 2024-10-01 02:21:29

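Maybe this will get you started. I'm not aware of any existing Python module that does exactly what you need, but this is the basis I would start from. I'm making assumptions based on what you provided.

Because each successive nesting level is based on some condition, you need to loop over the DataFrame and filter it. Depending on the size of the DataFrame, using groupby may be a better option than what I show here, but the idea is the same. You will also still have to assemble the key-value pairs properly; this only builds the supporting pieces for the structure you are constructing.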

# Assumes df is the question's frame, i.e. df = pd.DataFrame(data).
# Assume header1 is constant, so take its columns from the first row and build a dictionary.
header1 = dict(df.iloc[0][['col1','col2','col3','col4','col5']])
print('header1', header1)
# header3 looks like it needs the unique (col6, col7) combinations, so create a DataFrame
# of them and then iterate through it to get all the header3 dictionaries.
header3_dicts = []
dfh3 = df[['col6', 'col7']].drop_duplicates().reset_index(drop=True)
for i in range(dfh3.shape[0]):
    header3_dicts.append(dict(dfh3.iloc[i][['col6','col7']]))
print('header3', header3_dicts)
# Iterate over the header3 combinations to get header4.
for i in range(dfh3.shape[0]):
    # Keep only the rows that belong to this (col6, col7) combination.
    dfh4 = df.loc[(df['col6']==dfh3.iat[i,0]) & (df['col7']==dfh3.iat[i,1])]
    header4_dicts = []
    for j in range(dfh4.shape[0]):
        # Index the filtered frame (dfh4), not df, so the rows match the filter.
        header4_dicts.append(dict(dfh4.iloc[j][['col8','col9','col10','col11','col12','col13']]))
    print('header4', header4_dicts)
    # next level repeat similar to above
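
The groupby alternative mentioned in the answer could look roughly like the sketch below. It is only one possible reading of the expected output: it assumes header3 is keyed by col6 and col7, header4 by col8 through col13, and that every remaining row becomes a header5 entry. The helpers `to_native` and `nest` are hypothetical names introduced for this illustration, not part of pandas.

```python
import json
import pandas as pd

# The question's input data, repeated so the sketch runs on its own.
data = {
    'col1': ['xxxx'] * 6,
    'col2': [2.02011E+13] * 6,
    'col3': ['xxxx20201107023012'] * 6,
    'col4': ['yyyy'] * 6,
    'col5': [0] * 6,
    'col6': ['A', 'A', 'A', 'B', 'B', 'B'],
    'col7': [1, 1, 2, 1, 2, 2],
    'col8': ['A', 'A', 'A', 'B', 'B', 'B'],
    'col9': ['A', 'A', 'A', 'B', 'B', 'B'],
    'col10': ['A', 'A', 'A', 'B', 'B', 'B'],
    'col11': [0] * 6,
    'col12': [0] * 6,
    'col13': [0] * 6,
    'col14': [1, 1, 1, 1, 1, 2],
    'col15': [1, 2, 1, 1, 1, 1],
    'col16': [9, 10, 26, 9, 12, 4],
    'col17': [51, 63, 47, 59, 53, 56],
    'col18': [1, 1, 2, 1, 2, 3],
}
df = pd.DataFrame(data)


def to_native(value):
    """Turn a NumPy scalar into a plain Python value so json.dumps accepts it."""
    return value.item() if hasattr(value, 'item') else value


def nest(frame, levels, leaf_name, leaf_cols):
    """Group `frame` level by level; `levels` is a list of (list_name, key_columns)."""
    if not levels:
        # Innermost level: one dictionary per remaining row.
        rows = frame[leaf_cols].to_dict('records')
        return {leaf_name: [{c: to_native(v) for c, v in r.items()} for r in rows]}
    name, keys = levels[0]
    items = []
    # sort=False keeps the groups in order of first appearance.
    for key_values, group in frame.groupby(keys, sort=False):
        if not isinstance(key_values, tuple):
            key_values = (key_values,)
        item = dict(zip(keys, map(to_native, key_values)))
        item.update(nest(group, levels[1:], leaf_name, leaf_cols))
        items.append(item)
    return {name: items}


# Assumed grouping keys for each nesting level (taken from the expected output).
levels = [
    ('header3', ['col6', 'col7']),
    ('header4', ['col8', 'col9', 'col10', 'col11', 'col12', 'col13']),
]
header5_cols = ['col14', 'col15', 'col16', 'col17', 'col18']
header1_cols = ['col1', 'col2', 'col3', 'col4', 'col5']

result = {
    # header1 is assumed constant, so it is read from the first row.
    'header1': {c: to_native(df[c].iloc[0]) for c in header1_cols},
    'header2': nest(df, levels, 'header5', header5_cols),
}
print(json.dumps(result, indent=2))
```

Because the levels are passed in as a list, adding or removing a nesting level only means editing `levels`. The values are emitted with the types pandas holds them in, so any string formatting the target system expects (for example "0" instead of 0) would still need to be applied separately.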
