Convert a DataFrame or CSV file to custom nested JSON

Posted 2024-10-06 11:18:06


I have a CSV file, loaded into a DataFrame (df), with the following structure:

My DataFrame:

[image: screenshot of the DataFrame]

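I want to use Python to convert the data into the JSON format below. I have looked at several links, but I got lost in the nested part. The links I checked: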

How to convert pandas dataframe to uniquely structured nested json

convert dataframe to nested json

{
  "PHI": 2,
  "firstname": "john",
  "medicalHistory": {
    "allergies": "egg",
    "event": {
      "inPatient": {
        "hospitalized": {
          "visit": "7-20-20",
          "noofdays": "5",
          "test": {
            "modality": "xray"
          },
          "vitalSign": {
            "temperature": "32",
            "heartRate": "80"
          },
          "patientcondition": {
            "headache": "1",
            "cough": "0"
          }
        },
        "icu": {
          "visit": "",
          "noofdays": ""
        }
      },
      "outpatient": {
        "visit": "5-20-20",
        "vitalSign": {
          "temperature": "32",
          "heartRate": "80"
        },
        "patientcondition": {
          "headache": "1",
          "cough": "1"
        },
        "test": {
          "modality": "blood"
        }
      }
    }
  }
}

It would be very helpful if someone could help me with the nested part.


1 answer
User
#1 · Posted 2024-10-06 11:18:06

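You need one or more helper functions to unpack the table data like this. Write the main helper function to accept two arguments: (1) the df and (2) a schema. The schema is used to unpack each row of the df into a nested structure. The schema below is an example of how to do this for a subset of the logic you describe. It is not exactly what you specified in your example, but it should be enough of a hint for you to finish the rest of the task yourself.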

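from operator import itemgetter

# df is assumed to be the DataFrame already loaded from the CSV file.
groupby_idx = ['PHI', 'firstName']
# Group the rows so that all rows of one patient are processed together.
groups = df.groupby(groupby_idx)

# Each leaf is a callable that reads one column from a row;
# each nested dict becomes a nested object in the output.
schema = {
    "event": {
        "eventType": itemgetter('event'),
        "visit": itemgetter('visit'),
        "noOfDays": itemgetter('noofdays'),
        "test": {
            "modality": itemgetter('test')
        },
        "vitalSign": {
            "temperature": itemgetter('temperature'),
            "heartRate": itemgetter('heartRate')
        },
        "patientCondition": {
            "headache": itemgetter('headache'),
            "cough": itemgetter('cough')
        }
    }
}

def unpack(obj, schema):
    """Recursively turn one row into a nested dict shaped like the schema."""
    tmp = {}
    for k, v in schema.items():
        if isinstance(v, dict):
            # Nested schema: recurse to build a nested object.
            tmp[k] = unpack(obj, v)
        elif callable(v):
            # Leaf: call the getter to read the value from the row.
            tmp[k] = v(obj)
    return tmp

def apply_unpack(groups, schema):
    """Map each (PHI, firstName) group key to the list of its unpacked rows."""
    results = {}
    for gidx, df in groups:
        events = []
        for ridx, obj in df.iterrows():
            d = unpack(obj, schema)
            events.append(d)
        results[gidx] = events
    return results

unpacked = apply_unpack(groups, schema)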
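
To see the schema idea end to end, here is a minimal, self-contained sketch. It rests on several assumptions: the sample rows are made up, the column names are guessed from the getters in the code above (the real DataFrame was only shown as an image), and `build_records` is a hypothetical helper, not part of pandas. It keys each patient's events by the value of the `event` column, which moves the output a bit closer to the JSON in the question, and serializes the result with `json.dumps`.

```python
import json
from operator import itemgetter

import pandas as pd

# Hypothetical sample data; the column names are assumed to match the
# getters used in the schema. All values are strings, as if the CSV had
# been read with dtype=str, so they serialize to JSON without conversion.
df = pd.DataFrame(
    [
        {"PHI": "2", "firstName": "john", "event": "inPatient",
         "visit": "7-20-20", "noofdays": "5", "test": "xray",
         "temperature": "32", "heartRate": "80", "headache": "1", "cough": "0"},
        {"PHI": "2", "firstName": "john", "event": "outpatient",
         "visit": "5-20-20", "noofdays": "", "test": "blood",
         "temperature": "32", "heartRate": "80", "headache": "1", "cough": "1"},
    ]
)

# Leaves are getters for one column; nested dicts become nested objects.
schema = {
    "visit": itemgetter("visit"),
    "noofdays": itemgetter("noofdays"),
    "test": {"modality": itemgetter("test")},
    "vitalSign": {
        "temperature": itemgetter("temperature"),
        "heartRate": itemgetter("heartRate"),
    },
    "patientcondition": {
        "headache": itemgetter("headache"),
        "cough": itemgetter("cough"),
    },
}


def unpack(row, schema):
    """Recursively build a nested dict for one row by following the schema."""
    out = {}
    for key, value in schema.items():
        if isinstance(value, dict):
            out[key] = unpack(row, value)
        elif callable(value):
            out[key] = value(row)
    return out


def build_records(df, schema):
    """Return one nested dict per (PHI, firstName) group, events keyed by type."""
    records = []
    for (phi, first_name), group in df.groupby(["PHI", "firstName"]):
        # One entry per row, keyed by the row's event type.
        events = {row["event"]: unpack(row, schema) for _, row in group.iterrows()}
        records.append({
            "PHI": phi,
            "firstname": first_name,
            "medicalHistory": {"event": events},
        })
    return records


records = build_records(df, schema)
print(json.dumps(records, indent=2))
```

In this sketch, a patient with two rows of the same event type would keep only the last one, because the events are stored in a dict. If that can happen in your data, collecting the rows into a list per event type is probably the safer choice.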
