JSON到Python Pandas datafram

[ { "ID": "12345", "productName": "Product A ", "minPrice": "$89.00", "maxPrice": "$89.00", "variationProducts": [ { "variantColor": "JJ0BVE7", "variantSize": "080", "sellingPrice": "$89.00", "inventory": 3, }, { "variantColor": "JJ0BVE7", "variantSize": "085", "sellingPrice": "$89.00", "inventory": 6, } ], "productAttributes": [ { "ID": "countryOfOrigin", "value": "Imported" }, { "ID": "csProductCode", "value": "1100" } ] }, { "ID": "23456", "productName": "Product B", "minPrice": "$29.99", "maxPrice": "$69.00", "variationProducts": [ { "variantColor": "JJ169Q0", "variantSize": "050", "sellingPrice": "$69.00", "inventory": 55, }, { "variantColor": "JJ123Q0", "variantSize": "055", "sellingPrice": "$69.00", "inventory": 5, } ], "productAttributes": [ { "ID": "countryOfOrigin", "value": "Imported" }, { "ID": "csProductCode", "value": "1101" } ] } ]

3条回答

网友

1楼 · 编辑于 2024-10-04 01:29:51

我认为您只需对数据进行少量解析，就可以将其转换为正确的格式，以便read_json能够正常工作。在

首次使用json.load文件（文件名）将json数据放入一个python对象，该对象将被列出。在

现在您需要转换这个列表，使每个对象都是dictionary，并且每个dictionary都有键作为列名，值作为您希望在该列中使用的值。在

一个你已经准备好的清单，然后你可以使用熊猫.DataFrame（列表）

网友

2楼 · 编辑于 2024-10-04 01:29:51

您可以使用^{}：

In [11]: pd.io.json.json_normalize(d, "variationProducts", ["ID", "maxPrice", "minPrice", "productName"], record_prefix=".")
Out[11]:
   .inventory .sellingPrice .variantColor .variantSize     ID maxPrice minPrice productName
0           3        $89.00       JJ0BVE7          080  12345   $89.00   $89.00  Product A
1           6        $89.00       JJ0BVE7          085  12345   $89.00   $89.00  Product A
2          55        $69.00       JJ169Q0          050  23456   $69.00   $29.99   Product B
3           5        $69.00       JJ123Q0          055  23456   $69.00   $29.99   Product B

In [12]: pd.io.json.json_normalize(d, "productAttributes", ["ID", "maxPrice", "minPrice", "productName"], record_prefix=".")
Out[12]:
               .ID    .value     ID maxPrice minPrice productName
0  countryOfOrigin  Imported  12345   $89.00   $89.00  Product A
1    csProductCode      1100  12345   $89.00   $89.00  Product A
2  countryOfOrigin  Imported  23456   $69.00   $29.99   Product B
3    csProductCode      1101  23456   $69.00   $29.99   Product B

然后您可以将这两个合并在一起。。。

网友

3楼 · 编辑于 2024-10-04 01:29:51

l = [
  {
    "ID": "12345",
    "productName": "Product A ",
    "minPrice": "$89.00",
    "maxPrice": "$89.00",
    "variationProducts": [
      {
        "variantColor": "JJ0BVE7",
        "variantSize": "080",
        "sellingPrice": "$89.00",
        "inventory": 3,
      },
      {
        "variantColor": "JJ0BVE7",
        "variantSize": "085",
        "sellingPrice": "$89.00",
        "inventory": 6,
      }
    ],
    "productAttributes": [
        {
        "ID": "countryOfOrigin",
        "value": "Imported"
      },
      {
        "ID": "csProductCode",
        "value": "1100"
      }
    ]
  },
  {
    "ID": "23456",
    "productName": "Product B",
    "minPrice": "$29.99",
    "maxPrice": "$69.00",
    "variationProducts": [
      {
        "variantColor": "JJ169Q0",
        "variantSize": "050",
        "sellingPrice": "$69.00",
        "inventory": 55,
      },
      {
        "variantColor": "JJ123Q0",
        "variantSize": "055",
        "sellingPrice": "$69.00",
        "inventory": 5,
      }
    ],
   "productAttributes": [
        {
        "ID": "countryOfOrigin",
        "value": "Imported"
      },
      {
        "ID": "csProductCode",
        "value": "1101"
      }
    ]
  }
]


import pandas as pd
from itertools import *

final_list = []
for val in l:
    d = {}
    d.update({key:val[key] for key in val.keys() if key not in ['variationProducts','productAttributes']})
    for prods,attrs in izip_longest(val['variationProducts'],val['productAttributes']):
        if prods:
            d.update(prods)
        if attrs:
            d.update({attrs['ID']:attrs['value']})
        final_list.append(d.copy())

pd.DataFrame(final_list)

相关问题更多 >

编程相关推荐

热门问题

热门文章