How do I create a nested JSON file from a Pandas DataFrame in Python?

Published 2024-10-04 11:34:50


My DataFrame looks like this:

 df.head(2)

Ord MOT  MVT  CUST  CreationSla  CreationPlanned CreationProposed  PickupSla  PickupPlanned PickupProposed
 12  TR    TT   DEA  12-3-2020      12-3-2020      12-3-2020       14-3-2020   14-3-2020    14-3-2020
 15  ZR    TD   DET  15-3-2020      15-3-2020      15-3-2020       16-3-2020   16-3-2020    16-3-2020

I want to create a nested JSON file in the following format:

Expected output

{
    "Ord" : "12",
    "MOT" : "TR",
    "MVT" : "TT",
    "CUST" : "DEA",
    "milestone" : {
        "creation" : {
            "sla" : "12-3-2020",
            "plan" : "12-3-2020",
            "proposed" : "12-3-2020"
        },
        "Pickup" : {
            "sla" : "14-3-2020",
            "plan" : "14-3-2020",
            "proposed" : "14-3-2020"
        }
    }
}

How can I achieve this in Python?


3 Answers

First, you need to loop over the rows of the DataFrame.

Then build a dictionary for each row. The overall idea looks like this:

result = []
# Column-name prefixes that should become nested groups.
predefined_columns = ['Creation', 'Pickup', 'Departure']
# For each column: the matching prefix, or False if it stays at the top level.
mapping_cl = []
for column in df.columns:
    flag = False
    for sub_str in predefined_columns:
        if sub_str in column:
            flag = True
            mapping_cl.append(sub_str)
            break
    if not flag:
        mapping_cl.append(False)
for index, row in df.iterrows():
    item = {}
    # Start an empty nested dict for every prefix that occurs in the columns.
    for cl in mapping_cl:
        if cl:
            item[cl] = {}
    for i, column in enumerate(df.columns):
        if mapping_cl[i]:
            # Strip the prefix: 'CreationSla' -> 'Sla'.
            cl_name = column.split(mapping_cl[i])[-1]
            item[mapping_cl[i]][cl_name] = row[column]
        else:
            item[column] = row[column]
    result.append(item)

`result` is now a list of dicts, one per row, with the Creation and Pickup columns grouped into nested dicts.
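The loop above groups the columns by prefix but does not wrap them in a "milestone" key or write a file. A minimal sketch of those last two steps follows, assuming the sample data from the question. `PREFIXES`, `KEY_NAMES`, `row_to_record`, and the output file name `orders.json` are illustrative names, not part of the original answer.

```python
import json
import tempfile
from pathlib import Path

import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame({
    "Ord": [12, 15],
    "MOT": ["TR", "ZR"],
    "MVT": ["TT", "TD"],
    "CUST": ["DEA", "DET"],
    "CreationSla": ["12-3-2020", "15-3-2020"],
    "CreationPlanned": ["12-3-2020", "15-3-2020"],
    "CreationProposed": ["12-3-2020", "15-3-2020"],
    "PickupSla": ["14-3-2020", "16-3-2020"],
    "PickupPlanned": ["14-3-2020", "16-3-2020"],
    "PickupProposed": ["14-3-2020", "16-3-2020"],
})

# Assumed prefixes and key renames (the question uses "plan" for "Planned").
PREFIXES = ("Creation", "Pickup")
KEY_NAMES = {"Sla": "sla", "Planned": "plan", "Proposed": "proposed"}


def row_to_record(row):
    """Turn one flat row dict into the nested layout from the question."""
    flat, milestone = {}, {}
    for column, value in row.items():
        prefix = next((p for p in PREFIXES if column.startswith(p)), None)
        if prefix is None:
            flat[column] = value
        else:
            suffix = column[len(prefix):]
            milestone.setdefault(prefix, {})[KEY_NAMES.get(suffix, suffix)] = value
    return {**flat, "milestone": milestone}


# astype(str) makes every value a string, as in the expected output ("Ord": "12").
records = [row_to_record(row) for row in df.astype(str).to_dict("records")]

# Hypothetical output location; any writable path works.
out_path = Path(tempfile.gettempdir()) / "orders.json"
with out_path.open("w", encoding="utf-8") as fh:
    json.dump(records, fh, indent=4)
```

Casting to strings up front also sidesteps the question of whether NumPy integers are JSON-serializable.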


You can create a JSON template and fill it with the data from each row:

import json

d = """{
    "Ord" : "%s",
    "MOT" : "%s",
    "MVT" : "%s",
    "CUST" : "%s",
    "milestone" : {
        "creation" : {
            "sla" : "%s",
            "plan" : "%s",
            "proposed" : "%s"
        },
        "Pickup" : {
            "sla" : "%s",
            "plan" : "%s",
            "proposed" : "%s"
        }
    }
}
"""
js = []

for item in df.values:
    js.append(json.loads(d%tuple(item.tolist())))

print(json.dumps(js))

Output:

[{"Ord": "12", "MOT": "TR", "MVT": "TT", "CUST": "DEA", "milestone": {"creation": {"sla": "12-3-2020", "plan": "12-3-2020", "proposed": "12-3-2020"}, "Pickup": {"sla": "14-3-2020", "plan": "14-3-2020", "proposed": "14-3-2020"}}}, {"Ord": "15", "MOT": "ZR", "MVT": "TD", "CUST": "DET", "milestone": {"creation": {"sla": "15-3-2020", "plan": "15-3-2020", "proposed": "15-3-2020"}, "Pickup": {"sla": "16-3-2020", "plan": "16-3-2020", "proposed": "16-3-2020"}}}]
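One caveat with a string template is that a value containing a double quote or backslash produces invalid JSON, so `json.loads` fails. A possible alternative, sketched below, builds the nested dict directly and leaves the escaping to `json.dumps`. The helper name `build_record` and the `rows` list (standing in for `df.values.tolist()`) are assumptions for illustration, and the column order is assumed to match the question.

```python
import json


def build_record(values):
    """Map one row (in the question's column order) onto the nested layout."""
    ord_, mot, mvt, cust, c_sla, c_plan, c_prop, p_sla, p_plan, p_prop = values
    return {
        "Ord": str(ord_),
        "MOT": mot,
        "MVT": mvt,
        "CUST": cust,
        "milestone": {
            "creation": {"sla": c_sla, "plan": c_plan, "proposed": c_prop},
            "Pickup": {"sla": p_sla, "plan": p_plan, "proposed": p_prop},
        },
    }


# Stand-in for df.values.tolist(), using the rows from the question.
rows = [
    [12, "TR", "TT", "DEA", "12-3-2020", "12-3-2020", "12-3-2020",
     "14-3-2020", "14-3-2020", "14-3-2020"],
    [15, "ZR", "TD", "DET", "15-3-2020", "15-3-2020", "15-3-2020",
     "16-3-2020", "16-3-2020", "16-3-2020"],
]

js = [build_record(row) for row in rows]
text = json.dumps(js)

# Hypothetical row with a quote in a value: it survives a round trip,
# whereas the %-template would yield a string that json.loads rejects.
tricky = build_record([99, 'T"R', "TT", "DEA"] + ["1-1-2020"] * 6)
round_trip = json.loads(json.dumps(tricky))
```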

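Since you mentioned pandas, I use wide_to_long and then groupby to build your format. Note that you will need to adjust the level arguments if the layout of your data changes.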

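import pandas as pd

# Raw string for the regex; 'index' is spelled out because recent pandas versions no longer accept the short form 'i'.
s = pd.wide_to_long(df, stubnames=['Creation', 'Pickup'], i=['Ord', 'MOT', 'MVT', 'CUST'], j='type', suffix=r'\w+').stack().unstack(level=-2)
js = [{**dict(zip(s.index.names[:-1], x)), **{'milestone': y.reset_index(level=[0, 1, 2, 3], drop=True).to_dict('index')}} for x, y in s.groupby(level=[0, 1, 2, 3])]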
js
[{'Ord': 12, 'MOT': 'TR', 'MVT': 'TT', 'CUST': 'DEA',
 'milestone':
  {'Creation':
  {'Planned': '12-3-2020', 'Proposed': '12-3-2020', 'Sla': '12-3-2020'}, 'Pickup': {'Planned': '14-3-2020', 'Proposed': '14-3-2020', 'Sla': '14-3-2020'}}},
 {'Ord': 15, 'MOT': 'ZR', 'MVT': 'TD', 'CUST': 'DET',
  'milestone':
  {'Creation':
  {'Planned': '15-3-2020', 'Proposed': '15-3-2020', 'Sla': '15-3-2020'}, 'Pickup': {'Planned': '16-3-2020', 'Proposed': '16-3-2020', 'Sla': '16-3-2020'}}}]
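The one-liner above is dense, so here is a step-by-step sketch of the same idea, assuming the sample data from the question. It is not the answerer's exact code: it skips the stack/unstack step, refers to index levels by name instead of position, and casts the identifier values to strings to match the question's expected output. The names `id_cols`, `long_df`, and `nested` are illustrative.

```python
import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame({
    "Ord": [12, 15],
    "MOT": ["TR", "ZR"],
    "MVT": ["TT", "TD"],
    "CUST": ["DEA", "DET"],
    "CreationSla": ["12-3-2020", "15-3-2020"],
    "CreationPlanned": ["12-3-2020", "15-3-2020"],
    "CreationProposed": ["12-3-2020", "15-3-2020"],
    "PickupSla": ["14-3-2020", "16-3-2020"],
    "PickupPlanned": ["14-3-2020", "16-3-2020"],
    "PickupProposed": ["14-3-2020", "16-3-2020"],
})

id_cols = ["Ord", "MOT", "MVT", "CUST"]

# Step 1: reshape to long form. The index becomes (Ord, MOT, MVT, CUST, type),
# where type is the suffix (Sla / Planned / Proposed), and the columns are
# the stub names Creation and Pickup.
long_df = pd.wide_to_long(
    df, stubnames=["Creation", "Pickup"], i=id_cols, j="type", suffix=r"\w+"
)

# Step 2: one group per order; drop the id levels so only `type` indexes the rows,
# then to_dict() yields {"Creation": {"Sla": ...}, "Pickup": {...}}.
js = []
for keys, group in long_df.groupby(level=id_cols):
    nested = group.droplevel(id_cols).to_dict()
    # str() matches the question's string values and avoids NumPy scalars.
    js.append({**dict(zip(id_cols, map(str, keys))), "milestone": nested})
```

Using level names rather than `[0, 1, 2, 3]` means only `id_cols` has to change if identifier columns are added or removed.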
