Converting CSV to structured nested JSON with Python

Posted 2024-05-19 07:57:13


I am trying to convert a flat CSV file into a nested JSON structure.

I have data like this:

State  SubRegion       Postcode  Suburb
ACT    South Canberra  2620      Oaks Estate
ACT    North Canberra  2601      Acton
ACT    North Canberra  2602      Ainslie
ACT    Gungahlin-Hall  2914      Amaroo

The output I want looks like this:

[
       {
          "name":"ACT",
          "regions":[
             {
                "name":"South Canberra",
                "suburbs":[
                   {
                      "postcode":"2620",
                      "name":"Oaks Estate"
                   }
                ]
             },
             {
                "name":"North Canberra",
                "suburbs":[
                   {
                      "postcode":"2601",
                      "name":"Acton"
                   },
                   {
                      "postcode":"2602",
                      "name":"Ainslie"
                   }
                ]
             },
             {
                "name":"Gungahlin-Hall",
                "suburbs":[
                   {
                      "postcode":"2914",
                      "name":"Amaroo"
                   }
                ]
             }
          ]
       }
    ]

I have tried to build this structure both with pandas and with a plain script, but I have not been able to get the nesting right.


2 answers

I have solved this problem. Here is the solution:

import csv

def getindex(convertedList, value):
    # Return the index of the entry whose 'name' equals value, or -1 if none matches.
    for index, item in enumerate(convertedList):
        if item['name'] == value:
            return index
    return -1

mainData = []
with open('Regions.csv', 'r', newline='') as file:
    reader = csv.reader(file)
    next(reader, None)  # skip the header row
    for row in reader:
        # Column order as in the sample data: State, SubRegion, Postcode, Suburb
        suburbObj = {'postcode': row[2], 'name': row[3]}
        index = getindex(mainData, row[0])
        if index > -1:
            regions = mainData[index]['regions']
            subindex = getindex(regions, row[1])
            if subindex > -1:
                # Known state and known region: just add the suburb
                regions[subindex]['suburbs'].append(suburbObj)
            else:
                # Known state, new region
                regions.append({'name': row[1], 'suburbs': [suburbObj]})
        else:
            # New state: create it together with its first region and suburb
            mainData.append({
                'name': row[0],
                'regions': [{'name': row[1], 'suburbs': [suburbObj]}]
            })

If anyone has a better-optimized version, feel free to post your solution.

Thanks
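
On the request for something more optimized: one possible direction is to replace the repeated linear searches with dictionary lookups. The following is only a sketch, assuming a comma-separated file with the four headers shown in the question. The helper name `nest_rows` and the inline `SAMPLE_CSV` string are made up for illustration and are not part of the original answer.

```python
import csv
import io
import json

# Sample rows copied from the question, written as comma-separated text
# (an assumption; the real file may use another delimiter).
SAMPLE_CSV = """State,SubRegion,Postcode,Suburb
ACT,South Canberra,2620,Oaks Estate
ACT,North Canberra,2601,Acton
ACT,North Canberra,2602,Ainslie
ACT,Gungahlin-Hall,2914,Amaroo
"""

def nest_rows(rows):
    """Group flat rows into state -> regions -> suburbs (hypothetical helper)."""
    states = {}  # state name -> {region name -> list of suburb dicts}
    for row in rows:
        regions = states.setdefault(row["State"], {})
        suburbs = regions.setdefault(row["SubRegion"], [])
        suburbs.append({"postcode": row["Postcode"], "name": row["Suburb"]})
    # Dicts keep insertion order (Python 3.7+), so first-seen order is preserved.
    return [
        {
            "name": state,
            "regions": [
                {"name": region, "suburbs": suburbs}
                for region, suburbs in regions.items()
            ],
        }
        for state, regions in states.items()
    ]

nested = nest_rows(csv.DictReader(io.StringIO(SAMPLE_CSV)))
print(json.dumps(nested, indent=4))
```

To read from a real file, the `io.StringIO(SAMPLE_CSV)` argument could be swapped for a file object opened with `newline=''`. Each row then costs two dictionary lookups instead of two scans over the lists built so far.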

I think this should work:

import csv
import json 

def add_new_region(name, postcode, name2):
    d = {"name" : name,
     "suburbs" : [add_suburb(postcode, name2)]
     }
    return d
    
def add_suburb(postcode, name):
    return {"postcode" :  postcode,
              "name" : name
              }
    
datalist=[]
region_dict={}
region_dict_counter = 0
with open("data.csv", "r") as f:
    data = csv.reader(f)
    next(data) # skip headers
    for row in data:
        if row[0] in region_dict.keys():
            for x in (datalist[region_dict[row[0]]])["regions"]:
                if x["name"] == row[1]:
                    (x["suburbs"]).append(add_suburb(row[2], row[3]))
                    break
            else :
                datalist[region_dict[row[0]]]["regions"].append(add_new_region(row[1], row[2], row[3]))
                    
        else:
            d = { "name" : row[0],
                 "regions" : [ add_new_region(row[1], row[2], row[3])]}
            datalist.append(d)
            region_dict[row[0]] = region_dict_counter
            region_dict_counter+=1
json_data=json.dumps(datalist, indent=4)
print(json_data)
with open("data.json", "w") as j:
    j.write(json_data)
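
Since the question mentions pandas, here is a rough sketch of how the same nesting might be done with two levels of `groupby`. It assumes a comma-separated input with the headers from the question. The function name `nest_with_pandas` and the inline sample string are hypothetical and only there to keep the example self-contained.

```python
import io
import json

import pandas as pd

# Sample rows from the question, as comma-separated text (an assumption).
SAMPLE = """State,SubRegion,Postcode,Suburb
ACT,South Canberra,2620,Oaks Estate
ACT,North Canberra,2601,Acton
ACT,North Canberra,2602,Ainslie
ACT,Gungahlin-Hall,2914,Amaroo
"""

def nest_with_pandas(df):
    """Build the state -> regions -> suburbs list from a flat DataFrame."""
    result = []
    # sort=False keeps groups in order of first appearance instead of sorting keys.
    for state, state_df in df.groupby("State", sort=False):
        regions = []
        for region, region_df in state_df.groupby("SubRegion", sort=False):
            suburbs = [
                {"postcode": postcode, "name": name}
                for postcode, name in zip(region_df["Postcode"], region_df["Suburb"])
            ]
            regions.append({"name": region, "suburbs": suburbs})
        result.append({"name": state, "regions": regions})
    return result

# dtype=str keeps postcodes as strings, matching the desired output.
df = pd.read_csv(io.StringIO(SAMPLE), dtype=str)
nested = nest_with_pandas(df)
print(json.dumps(nested, indent=4))
```

For a file on disk, a path can be passed to `pd.read_csv` in place of the `io.StringIO` object.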
