Problem with json_normalize and multiple record_path values

Posted 2024-06-02 16:44:46

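Apologies if this has been asked before. I'm new to pandas and am trying to use json_normalize() to flatten a nested API response into a tabular format. I'm having trouble working out how to pass different levels of nesting in the record_path parameter. My current code keeps failing with result = result[spec] KeyError: 'Type'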

I'm a bit lost and don't know what to try next or where to look. Any help is much appreciated.

Expected output:

  Count     Metric  Title   Platform   Begin_Date   End_Date        Type         Value
   1    Total_Req   AACN      OVID     2019-01-01   2019-02-28    Print_ISSN  1234-5678

Code snippet:

^{pr2}$

JSON snippet:

{
   'Title':'AACN Advanced Critical Care',
   'Item_ID':[
      {
         'Type':'Print_ISSN',
         'Value':'1559-7768'
      },
      {
         'Type':'Proprietary',
         'Value':'Ovid:01256961'
      }
   ],
   'Platform':'OvidMD',
   'Publisher':'American Association of Critical Care Nurses',
   'Publisher_ID':[
      {
         'Type':'Proprietary',
         'Value':'Ovid:21790'
      }
   ],
   'Performance':[
      {
         'Period':{
            'Begin_Date':'2019-02-01',
            'End_Date':'2019-02-28'
         },
         'Instance':[
            {
               'Metric_Type':'Total_Item_Requests',
               'Count':1
            },
            {
               'Metric_Type':'Unique_Item_Requests',
               'Count':1
            }
         ]
      }
   ]
}

1 answer
User
#1 · Posted 2024-06-02 16:44:46

Since the expected output doesn't match the sample JSON snippet, you may need to adapt this a bit, but it might get you going:

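import json
import re

import pandas as pd

# `response` is assumed to be the HTTP response object from the question's code.
data = json.loads(response.text)


def flatten_json(y):
    """Flatten nested dicts/lists into one dict with '_'-joined keys."""
    out = {}

    def flatten(x, name=''):
        if isinstance(x, dict):
            for a in x:
                flatten(x[a], name + a + '_')
        elif isinstance(x, list):
            for i, a in enumerate(x):
                flatten(a, name + str(i) + '_')
        else:
            # Leaf value: drop the trailing '_' from the accumulated key.
            out[name[:-1]] = x

    flatten(y)
    return out


flat = flatten_json(data)

results = pd.DataFrame()
special_cols = []

for item in flat:
    # The first list index found in the key becomes the row number.
    indices = re.findall(r'_(\d+)_', item)
    if not indices:
        # No list index in the key: add it as a constant column later.
        special_cols.append(item)
        continue
    row_idx = int(indices[0])
    # Everything after the first index becomes the column name.
    column = re.findall(r'_\d+_(.*)', item)[0].replace('_', '')

    results.loc[row_idx, column] = flat[item]

# Broadcast the index-free values to every row.
for item in special_cols:
    results[item] = flat[item]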
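
As an aside, it can help to look at the intermediate keys before relying on the loop above. The sketch below runs the same flattening idea on a cut-down copy of the question's JSON and shows how each key is split into a row index and a column name. `sample` and the helper `split_key` are names made up for this illustration; `split_key` is meant to mirror the two `re.findall` calls, not replace them.

```python
import re


def flatten_json(y):
    """Join nested keys and list indices with '_' (same idea as the answer)."""
    out = {}

    def flatten(x, name=""):
        if isinstance(x, dict):
            for key, value in x.items():
                flatten(value, name + key + "_")
        elif isinstance(x, list):
            for i, value in enumerate(x):
                flatten(value, name + str(i) + "_")
        else:
            out[name[:-1]] = x

    flatten(y)
    return out


# Cut-down copy of the question's record (assumption: enough to show the effect).
sample = {
    "Title": "AACN Advanced Critical Care",
    "Item_ID": [{"Type": "Print_ISSN", "Value": "1559-7768"}],
    "Publisher_ID": [{"Type": "Proprietary", "Value": "Ovid:21790"}],
}

flat = flatten_json(sample)


def split_key(key):
    """Hypothetical helper: (row index, column name) as the loop derives them."""
    match = re.search(r"_(\d+)_(.*)", key)
    if match is None:
        return None  # no list index, so the loop treats it as a special column
    return int(match.group(1)), match.group(2).replace("_", "")


for key in flat:
    print(key, "->", split_key(key))
```

Note that `Item_ID_0_Type` and `Publisher_ID_0_Type` both split to row 0, column `Type`, because the text before the first index is discarded. In the loop the later key overwrites the earlier one, so you may want to keep the parent name in the column if both values matter.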

Output:

^{pr2}$
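
If you would rather stay with json_normalize(), here is a minimal sketch, not a definitive fix. record_path takes a single path to one list of records, so the two sibling lists (Item_ID and Performance -> Instance) cannot both go into one call; the sketch normalizes each list separately and joins the results. The `record` dict is copied from the JSON snippet in the question. The variable names, the choice of Title as the join key, and the column renames are my own assumptions. The question's original call is not shown, so this does not pinpoint the exact cause of the KeyError.

```python
import pandas as pd

# Record copied from the question's JSON snippet.
record = {
    "Title": "AACN Advanced Critical Care",
    "Item_ID": [
        {"Type": "Print_ISSN", "Value": "1559-7768"},
        {"Type": "Proprietary", "Value": "Ovid:01256961"},
    ],
    "Platform": "OvidMD",
    "Publisher": "American Association of Critical Care Nurses",
    "Publisher_ID": [{"Type": "Proprietary", "Value": "Ovid:21790"}],
    "Performance": [
        {
            "Period": {"Begin_Date": "2019-02-01", "End_Date": "2019-02-28"},
            "Instance": [
                {"Metric_Type": "Total_Item_Requests", "Count": 1},
                {"Metric_Type": "Unique_Item_Requests", "Count": 1},
            ],
        }
    ],
}

# One row per Performance -> Instance entry; parent fields come along as meta.
metrics = pd.json_normalize(
    record,
    record_path=["Performance", "Instance"],
    meta=[
        "Title",
        "Platform",
        ["Performance", "Period", "Begin_Date"],
        ["Performance", "Period", "End_Date"],
    ],
).rename(
    columns={
        "Performance.Period.Begin_Date": "Begin_Date",
        "Performance.Period.End_Date": "End_Date",
    }
)

# One row per Item_ID entry, keyed by Title so it can be joined back.
ids = pd.json_normalize(record, record_path="Item_ID", meta=["Title"])

# Assumption: Title identifies the item well enough to serve as the join key.
combined = metrics.merge(ids, on="Title")
print(combined)
```

Joining on Title gives one row per metric and identifier pair (2 x 2 = 4 rows for this record). If only one identifier is wanted, filter first, for example `ids[ids["Type"] == "Print_ISSN"]`. With a full API response containing many items, a more reliable join key than Title may be needed.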
