如何将具有相同数据的嵌套json转换为CSV python

2024-10-17 08:36:35 发布

您现在位置:Python中文网/ 问答频道 /正文

我有包含相同键名的嵌套数组的json数据。 我的Json格式如下(可能会改变键值对名称):

{
  "name": "bharat",
  "age": 27, 
  "vehicles": [
      {
          "car": "tata",
          "bike": "duke",
          "plane": "n",            
      },
      {
          "car": "odi",
          "bike": "duke",
          "plane": "n",       
      }]
 }

我试过了 Convert nested JSON to CSV file in Python 但车上有多列钥匙相同。你知道吗

我的密码是

import json
import csv
from elasticsearch import Elasticsearch
import elasticsearch.helpers

with open("query.json") as f:
  query=json.load(f)

es = Elasticsearch(['http://xx.xx.xx.xx:xxxx'],verify_certs=False)

results_gen = elasticsearch.helpers.scan(
    es,
    query=query,
    index="demo",
)

def get_leaves(item, key=None):
    if isinstance(item, dict):
        leaves = []

        for i in item.keys():
            leaves.extend(get_leaves(item[i], i))
        return leaves
    elif isinstance(item, list):
        leaves = []
        for i in item:
            leaves.extend(get_leaves(i, key))
        return leaves
    else:
        return [(key, item)]

with  open('Data.csv', 'w', newline='') as f_output:
    csv_output = csv.writer(f_output)
    write_header = True

    for entry in results_gen:
        e=entry['_source']

        leaf_entries = sorted(get_leaves(e))
    print(leaf_entries)
        if write_header:
            csv_output.writerow([k for k, v in leaf_entries])
            write_header = False
        csv_output.writerow([v for k, v in leaf_entries])

我得到的结果是

name    age    car    car  bike   bike plane plane
bharat  27     tata   odi  duke    duke  n     n

我希望输出像

name    age    car    bike    plane
bharat  27     tata   duke    n     
bharat  27     odi    duke    n

Tags: csvinimportjsonforoutputitemquery
2条回答

类似这样的内容(假设您只对车辆数据感兴趣)

data = {"name": "bharat", "age": 27, "vehicles": [{"car": "tata", "bike": "duke", "plane": "n",
                                                   }, {"car": "odi", "bike": "duke", "plane": "n",
                                                       }]}
with open('out.csv', 'w') as out:
    for v in data['vehicles']:
        line = ','.join(v.values())
        out.write(line + '\n')

您可以使用Pandas Normalize类将json规范化为Pandas dataframe,然后将dataframe写入csv

假设变量'data'中有Json字典,比如:[{json1},{json2},{json3}…]

    import json
    import pandas
    from pandas.io.json import json_normalize

    for d in data:
        normalized_data = json_normalize(d)
        normalized_data .to_csv('csv_file_name', sep='|', mode='a', index=False, na_rep='', header=False)

参考:https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.io.json.json_normalize.html

相关问题 更多 >