Converting a DataFrame or CSV file to custom nested JSON

{
    "PHI": 2,
    "firstname": "john",
    "medicalHistory": {
        "allergies": "egg",
        "event": {
            "inPatient": {
                "hospitalized": {
                    "visit": "7-20-20",
                    "noofdays": "5",
                    "test": { "modality": "xray" },
                    "vitalSign": { "temperature": "32", "heartRate": "80" },
                    "patientcondition": { "headache": "1", "cough": "0" }
                },
                "icu": {
                    "visit": "",
                    "noofdays": ""
                }
            },
            "outpatient": {
                "visit": "5-20-20",
                "vitalSign": { "temperature": "32", "heartRate": "80" },
                "patientcondition": { "headache": "1", "cough": "1" },
                "test": { "modality": "blood" }
            }
        }
    }
}

1 answer

User

#1 · Posted on 2024-10-06 11:18:06

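You will need one or more helper functions to unpack the table data like this. Write the main helper function so that it accepts two arguments: (1) the df and (2) a schema. The schema is used to unpack each row of the df into a nested structure. The schema below is an example of how to do this for a subset of the logic you describe. It is not exactly what you specified in your example, but it should be enough of a hint for you to complete the rest of the task yourself.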

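from operator import itemgetter

# df is assumed to be the DataFrame loaded from your table or CSV, one row per event.
groupby_idx = ['PHI', 'firstName']
# Group the rows by patient so that each group holds one patient's events.
groups = df.groupby(groupby_idx)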
schema = {
    "event": {
        "eventType": itemgetter('event'), 
        "visit": itemgetter('visit'),
        "noOfDays": itemgetter('noofdays'),
        "test": {
            "modality": itemgetter('test')
        },
        "vitalSign": {
            "temperature": itemgetter('temperature'),
            "heartRate": itemgetter('heartRate')
        },
        "patientCondition": {
            "headache": itemgetter('headache'),
            "cough": itemgetter('cough')
        }
    }
}

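def unpack(obj, schema):
    """Build a nested dict from one row (obj) by walking the schema recursively."""
    tmp = {}
    for k, v in schema.items():
        if isinstance(v, dict):
            # Nested schema: recurse to build the sub-dictionary.
            tmp[k] = unpack(obj, v)
        elif callable(v):
            # Leaf: the callable (here an itemgetter) pulls the value from the row.
            tmp[k] = v(obj)
    return tmp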

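def apply_unpack(groups, schema):
    """Return a dict mapping each group key to the list of its unpacked rows."""
    results = {}
    for gidx, group in groups:
        events = []
        for ridx, obj in group.iterrows():
            d = unpack(obj, schema)
            events.append(d)
        # gidx is a tuple of the groupby key values, e.g. (PHI, firstName).
        results[gidx] = events
    return results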

unpacked = apply_unpack(groups, schema)
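
The result above is keyed by tuples, which `json.dumps` cannot use as object keys, so one more step is needed before writing JSON. The following is only a minimal sketch of that step, not the full structure asked for in the question: the sample rows, the shortened schema, and the helper name `to_records` are all made up for illustration, and the column names are assumed to match those used in the schema above.

```python
import json
from operator import itemgetter

import pandas as pd

# Hypothetical sample data; column names are assumed to match the schema above.
df = pd.DataFrame([
    {"PHI": 2, "firstName": "john", "event": "inPatient", "visit": "7-20-20",
     "test": "xray"},
    {"PHI": 2, "firstName": "john", "event": "outpatient", "visit": "5-20-20",
     "test": "blood"},
])

# Shortened schema, for illustration only.
schema = {
    "eventType": itemgetter("event"),
    "visit": itemgetter("visit"),
    "test": {"modality": itemgetter("test")},
}

def unpack(row, schema):
    """Build a nested dict from one row by walking the schema recursively."""
    out = {}
    for key, value in schema.items():
        if isinstance(value, dict):
            out[key] = unpack(row, value)
        elif callable(value):
            out[key] = value(row)
    return out

def to_records(df, keys, schema):
    """Hypothetical helper: one JSON-friendly dict per group of rows."""
    records = []
    for gidx, group in df.groupby(keys):
        # Convert NumPy scalars (e.g. int64) to plain Python values for json.
        record = {k: (v.item() if hasattr(v, "item") else v)
                  for k, v in zip(keys, gidx)}
        record["events"] = [unpack(row, schema) for _, row in group.iterrows()]
        records.append(record)
    return records

records = to_records(df, ["PHI", "firstName"], schema)
print(json.dumps(records, indent=2))
```

Each patient becomes one dictionary with the group keys at the top level and the unpacked rows under `events`. Nesting the events by type, as in the target JSON, would still have to be added on top of this.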
