Python DataFrame to JSON (multi-level)

Published 2024-06-02 20:39:19


I have a pandas DataFrame with the following columns:

   CUSTOMER_ID PRODUCT_ID VENDOR_ID         DAT        ORDER_ID COLOR_ID  
0     10078229  508136536       450  2018-11-23  20183200576771     1000   
1     10078229  508136532       450  2018-11-23  20183200576771     1000   
2     10202280  506894206       450  2018-11-23  20183231461778     1000   
3     10207584  500970872      2097  2018-11-23  20183231430937     1002   
4     10207584  500970872      2097  2018-11-23  20183231430937     1000   
5     10268028  511131122       450  2018-11-23  20183231418341     1000   
6     10268028  509736876       450  2018-11-23  20183231418341     1000   
7     10268028  507095754       450  2018-11-23  20183231418341     1000   
8     10268028  513902792       450  2018-11-23  20183231418341     1000   
9     10383692  508229004       450  2018-11-23  20183190670154     1000

I would like JSON output in the following format:

[{
        "CUSTOMER_ID": "10078229",
        "PRODUCT": [{
            "PRODUCT_ID": "508136536",
            "VENDOR_ID": "450",
            "DAT": "2018-11-23",
            "ORDER_ID": "20183200576771",
            "COLOR_ID": "1000",
            "SIZE_ID": "1000"
        }, {
            "PRODUCT_ID": "508136532",
            "VENDOR_ID": "450",
            "DAT": "2018-11-23",
            "ORDER_ID": "20183200576771",
            "COLOR_ID": "1000",
            "SIZE_ID": "1002"
        }]
    },
    {
        "CUSTOMER_ID": "10202280",
        "PRODUCT": [{
            "PRODUCT_ID": "506894206",
            "VENDOR_ID": "450",
            "DAT": "2018-11-23",
            "ORDER_ID": "20183231461778",
            "COLOR_ID": "1000",
            "SIZE_ID": "1000"
        }]
    }
]

I have tried, but so far I have not managed it without fragile string concatenation. Here is my code:

df_cre=pd.DataFrame()
ids=df_test["CUSTOMER_ID"].unique()

for i in ids:
    df2=df_test[df_test["CUSTOMER_ID"]== i]
    df2=df2.drop(columns='CUSTOMER_ID')
    js2="{\"CUSTOMER_ID\": \""+str(i)+"\",\"PRODUCTS\" :" + df2.to_json(orient='records', lines=False) + "}"
    df_cre=pd.concat([df_cre, pd.DataFrame([[i,js2]], columns=['CUSTOMER_ID','KEY_EVENT'])])



json_final='['
for row in df_cre.itertuples():
    json_final+= row.KEY_EVENT +','

json_final=json_final[:-1]    
json_final+= ']'

Is there a way to do this with a built-in function?

Thanks a lot.

Edit: this is the shape I want for my output (three levels: customer, order, then product and vendor). How would you do it?

[
    {
    "CUSTOMER_ID": 10078229,
    "ORDER" : [
        {
        "ORDER_ID": 20183200576771,
        "DAT": "2018-11-23",
        "PRODUCT": [
            {
            "PRODUCT_ID": 508136536,
            "COLOR_ID": 1000,
            "SIZE_ID" : 1002
            },
            {
            "PRODUCT_ID": 508136532,
            "COLOR_ID": 1000,
            "SIZE_ID" : 1003
            }
                ],
        "VENDOR": [
            {
            "VENDOR_ID" : 1234
            },
            {
            "VENDOR_ID" : 12345
            }    ]
        },
        {
        "ORDER_ID" : 2222 ...
        }   ]
    }
    , "CUSTOMER_ID" : 12345 ....
 ]

Thank you.


3 answers
result = [{"CUSTOMER_ID":name,"PRODUCT":group[['PRODUCT_ID','VENDOR_ID','DAT','ORDER_ID','COLOR_ID']].to_dict("records")} for name,group in df.groupby('CUSTOMER_ID')] 

Then print(result). This should help.
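To turn that list into an actual JSON string, the standard json module can be applied to it. The following is a minimal sketch, assuming a small sample frame copied from the first rows of the question; the names product_cols and json_text are illustrative only, and the int(name) cast is a precaution in case the group key comes back as a NumPy scalar.

```python
import json
import pandas as pd

# Small sample mirroring the first three rows of the question's table.
df = pd.DataFrame({
    "CUSTOMER_ID": [10078229, 10078229, 10202280],
    "PRODUCT_ID": [508136536, 508136532, 506894206],
    "VENDOR_ID": [450, 450, 450],
    "DAT": ["2018-11-23"] * 3,
    "ORDER_ID": [20183200576771, 20183200576771, 20183231461778],
    "COLOR_ID": [1000, 1000, 1000],
})

# Columns that go inside each customer's "PRODUCT" list (illustrative name).
product_cols = ["PRODUCT_ID", "VENDOR_ID", "DAT", "ORDER_ID", "COLOR_ID"]

result = [
    {
        # Cast defensively so json.dumps always receives a plain Python int.
        "CUSTOMER_ID": int(name),
        "PRODUCT": group[product_cols].to_dict("records"),
    }
    for name, group in df.groupby("CUSTOMER_ID")
]

# Serialize the nested list of dicts to a JSON string.
json_text = json.dumps(result, indent=2)
print(json_text)
```

Building plain Python objects first and serializing once at the end avoids the manual string concatenation from the question.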

Something like this?

df2 = df.groupby("CUSTOMER_ID")[['PRODUCT_ID', 'VENDOR_ID', 'DAT', 'ORDER_ID','COLOR_ID']].apply(lambda x: x.to_dict(orient="records")).reset_index(name="PRODUCT").to_json(orient="records")

Output (pretty-printed):

[
  {
    "CUSTOMER_ID": 10078229,
    "PRODUCT": [
      {
        "PRODUCT_ID": 508136536,
        "VENDOR_ID": 450,
        "DAT": "2018-11-23",
        "ORDER_ID": 20183200576771,
        "COLOR_ID": 1000
      },
      {
        "PRODUCT_ID": 508136532,
        "VENDOR_ID": 450,
        "DAT": "2018-11-23",
        "ORDER_ID": 20183200576771,
        "COLOR_ID": 1000
      }
    ]
  },
  {
    "CUSTOMER_ID": 10202280,
    "PRODUCT": [
      {
        "PRODUCT_ID": 506894206,
        "VENDOR_ID": 450,
        "DAT": "2018-11-23",
        "ORDER_ID": 20183231461778,
        "COLOR_ID": 1000
      }
    ]
  },
  {
    "CUSTOMER_ID": 10207584,
    "PRODUCT": [
      {
        "PRODUCT_ID": 500970872,
        "VENDOR_ID": 2097,
        "DAT": "2018-11-23",
        "ORDER_ID": 20183231430937,
        "COLOR_ID": 1002
      },
      {
        "PRODUCT_ID": 500970872,
        "VENDOR_ID": 2097,
        "DAT": "2018-11-23",
        "ORDER_ID": 20183231430937,
        "COLOR_ID": 1000
      }
    ]
  },
  {
    "CUSTOMER_ID": 10268028,
    "PRODUCT": [
      {
        "PRODUCT_ID": 511131122,
        "VENDOR_ID": 450,
        "DAT": "2018-11-23",
        "ORDER_ID": 20183231418341,
        "COLOR_ID": 1000
      },
      {
        "PRODUCT_ID": 509736876,
        "VENDOR_ID": 450,
        "DAT": "2018-11-23",
        "ORDER_ID": 20183231418341,
        "COLOR_ID": 1000
      },
      {
        "PRODUCT_ID": 507095754,
        "VENDOR_ID": 450,
        "DAT": "2018-11-23",
        "ORDER_ID": 20183231418341,
        "COLOR_ID": 1000
      },
      {
        "PRODUCT_ID": 513902792,
        "VENDOR_ID": 450,
        "DAT": "2018-11-23",
        "ORDER_ID": 20183231418341,
        "COLOR_ID": 1000
      }
    ]
  },
  {
    "CUSTOMER_ID": 10383692,
    "PRODUCT": [
      {
        "PRODUCT_ID": 508229004,
        "VENDOR_ID": 450,
        "DAT": "2018-11-23",
        "ORDER_ID": 20183190670154,
        "COLOR_ID": 1000
      }
    ]
  }
]
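The output above has numeric values, whereas the first desired output in the question quotes every value as a string. One possible way to get quoted values is to cast the frame to str before grouping. This is a sketch under that assumption, using a small sample frame; df_str, json_str and parsed are illustrative names.

```python
import json
import pandas as pd

# Small sample mirroring the first three rows of the question's table.
df = pd.DataFrame({
    "CUSTOMER_ID": [10078229, 10078229, 10202280],
    "PRODUCT_ID": [508136536, 508136532, 506894206],
    "VENDOR_ID": [450, 450, 450],
    "DAT": ["2018-11-23"] * 3,
    "ORDER_ID": [20183200576771, 20183200576771, 20183231461778],
    "COLOR_ID": [1000, 1000, 1000],
})

# Cast every column to str so the JSON values come out quoted.
df_str = df.astype(str)

json_str = (
    df_str.groupby("CUSTOMER_ID")[
        ["PRODUCT_ID", "VENDOR_ID", "DAT", "ORDER_ID", "COLOR_ID"]
    ]
    .apply(lambda g: g.to_dict(orient="records"))  # one list of dicts per customer
    .reset_index(name="PRODUCT")
    .to_json(orient="records")
)

# to_json returns a compact string; round-trip through json to pretty-print it.
parsed = json.loads(json_str)
print(json.dumps(parsed, indent=2))
```

Note that after the cast the customers are ordered as strings, which can differ from numeric order when the IDs have different lengths.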

This will work if one entry per row is acceptable (it does not group rows by customer):

print([{'CUSTOMER_ID': x['CUSTOMER_ID'],
        'PRODUCT': {k: v for k, v in x.items() if k != 'CUSTOMER_ID'}}
       for x in df.to_dict('records')])
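None of the answers above covers the three-level shape from the question's edit (customer, then order, then product and vendor). One way to approach it is a nested groupby, sketched below. The helper build_orders is hypothetical, the sample frame is copied from the question's table, SIZE_ID is left out because the table shown does not contain it, and DAT is assumed to be constant within an order.

```python
import json
import pandas as pd

# Sample rows copied from the question's table (no SIZE_ID column there).
df = pd.DataFrame({
    "CUSTOMER_ID": [10078229, 10078229, 10207584, 10207584],
    "PRODUCT_ID": [508136536, 508136532, 500970872, 500970872],
    "VENDOR_ID": [450, 450, 2097, 2097],
    "DAT": ["2018-11-23"] * 4,
    "ORDER_ID": [20183200576771, 20183200576771,
                 20183231430937, 20183231430937],
    "COLOR_ID": [1000, 1000, 1002, 1000],
})


def build_orders(customer_rows):
    """Return the list of order dicts for one customer's rows (hypothetical helper)."""
    orders = []
    # Grouping on both keys assumes DAT is the same for every row of an order.
    for (order_id, dat), order_rows in customer_rows.groupby(["ORDER_ID", "DAT"]):
        orders.append({
            "ORDER_ID": int(order_id),
            "DAT": dat,
            "PRODUCT": order_rows[["PRODUCT_ID", "COLOR_ID"]].to_dict("records"),
            # One entry per distinct vendor within the order.
            "VENDOR": order_rows[["VENDOR_ID"]].drop_duplicates().to_dict("records"),
        })
    return orders


nested = [
    {"CUSTOMER_ID": int(customer_id), "ORDER": build_orders(rows)}
    for customer_id, rows in df.groupby("CUSTOMER_ID")
]

print(json.dumps(nested, indent=2))
```

Each extra nesting level adds one more groupby inside the helper, so the same pattern should extend to deeper shapes.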
