Flattening irregularly nested JSON with pandas

Published 2024-09-29 01:23:35


I have the following JSON object and am trying to convert it to a DataFrame.

Data:

{
  "data": {
    "docs": [
      {
        "id": "1",
        "col1": "foo",
        "col2": "123",
        "list": ["foo barr, fooo"]
      },
      {
        "id": "2",
        "col1": "abc",
        "col2": "321",
        "list": ["lirum epsum"]
      },
      {
        "id": "3",
        "col1": "foo",
        "col2": "123",
        "list": null
      }
    ]
  }
}

Ideally, the list column should contain a single string rather than a list, like this:

id  col1    col2    list
1   foo     123     'foo barr, fooo'
2   abc     321     'lirum epsum'
3   foo     123      NaN

The following approach raises an exception (TypeError: 'NoneType' object is not iterable):

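import json
from pandas import json_normalize

with open(path_to_json, encoding='utf-8') as json_file:
    q = json.load(json_file)
    # Fails: record_path descends into "list", which is null (None) for id 3
    df = json_normalize(q['data'], record_path=['docs', 'list'])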

2 answers
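import numpy as np
import pandas as pd

# Named `data` rather than `json` so the standard json module is not shadowed
data = {
  "data": {
    "docs": [
      {
        "id": "1",
        "col1": "foo",
        "col2": "123",
        "list": ["foo barr, fooo"]
      },
      {
        "id": "2",
        "col1": "abc",
        "col2": "321",
        "list": ["lirum epsum"]
      },
      {
        "id": "3",
        "col1": "foo",
        "col2": "123",
        "list": np.nan
      }
    ]
  }
}
# Build the frame from the list of records; "list" stays a single column
df = pd.DataFrame(data["data"]["docs"]).set_index("id")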

This gives you:

    id  col1    col2    list
    1   foo     123     [foo barr, fooo]
    2   abc     321     [lirum epsum]
    3   foo     123     NaN
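The answer above hard-codes the data with np.nan in place of null. When the JSON is actually parsed with the json module, null arrives as Python None instead. The following is a minimal sketch under that assumption; the inline `raw` string stands in for the file contents and is not part of the original question. It normalizes the "docs" records only, so the "list" column is never iterated:

```python
import json

import pandas as pd

# Hypothetical inline copy of the question's data, standing in for the file
raw = """
{"data": {"docs": [
  {"id": "1", "col1": "foo", "col2": "123", "list": ["foo barr, fooo"]},
  {"id": "2", "col1": "abc", "col2": "321", "list": ["lirum epsum"]},
  {"id": "3", "col1": "foo", "col2": "123", "list": null}
]}}
"""

q = json.loads(raw)  # JSON null is parsed as Python None

# Normalize the list of doc records; "list" remains one column holding
# either a Python list or a missing value, so no TypeError is raised.
df = pd.json_normalize(q["data"]["docs"])
print(df)
```

With flat records like these, this should be equivalent to pd.DataFrame(q["data"]["docs"]); json_normalize only becomes necessary if the docs contain nested dicts.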

I can't add comments yet to complete the answer above, but you can convert the list column to strings with the following code:

# Note: missing values are stringified too (NaN becomes 'nan', None becomes 'None')
df['list'] = df['list'].apply(lambda x: str(x).strip('[\']'))
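If the missing value should stay NaN, as in the desired output in the question, one possible alternative is to join the list elements explicitly instead of stripping characters from the list's string form. This is only a sketch: `list_to_str` and its `sep` parameter are hypothetical names introduced here, and the small frame is rebuilt inline so the example runs on its own.

```python
import numpy as np
import pandas as pd

# Small stand-in for the frame built from the question's data
df = pd.DataFrame({
    "id": ["1", "2", "3"],
    "list": [["foo barr, fooo"], ["lirum epsum"], None],
})


def list_to_str(value, sep=", "):
    """Join list elements into one string; map anything else to NaN."""
    if isinstance(value, list):
        return sep.join(str(item) for item in value)
    return np.nan  # covers None and NaN alike


df["list"] = df["list"].apply(list_to_str)
print(df)
```

Unlike the strip approach, this also handles lists with more than one element without leaving quote characters between the items.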
