I need to flatten JSON that contains nested JSON arrays at different levels, using Python.
Part of my JSON looks like this:
{
  "data": {
    "workbooks": [
      {
        "projectName": "TestProject",
        "name": "wkb1",
        "site": {
          "name": "site1"
        },
        "description": "",
        "createdAt": "2020-12-13T15:38:58Z",
        "updatedAt": "2020-12-13T15:38:59Z",
        "owner": {
          "name": "user1",
          "username": "John"
        },
        "embeddedDatasources": [
          {
            "name": "DS1",
            "hasExtracts": false,
            "upstreamDatasources": [
              {
                "projectName": "Data Sources",
                "name": "DS1",
                "hasExtracts": false,
                "owner": {
                  "username": "user2"
                }
              }
            ],
            "upstreamTables": [
              {
                "name": "table_1",
                "schema": "schema_1",
                "database": {
                  "name": "testdb",
                  "connectionType": "redshift"
                }
              },
              {
                "name": "table_2",
                "schema": "schema_2",
                "database": {
                  "name": "testdb",
                  "connectionType": "redshift"
                }
              },
              {
                "name": "table_3",
                "schema": "schema_3",
                "database": {
                  "name": "testdb",
                  "connectionType": "redshift"
                }
              }
            ]
          },
          {
            "name": "DS2",
            "hasExtracts": false,
            "upstreamDatasources": [
              {
                "projectName": "Data Sources",
                "name": "DS2",
                "hasExtracts": false,
                "owner": {
                  "username": "user3"
                }
              }
            ],
            "upstreamTables": [
              {
                "name": "table_4",
                "schema": "schema_1",
                "database": {
                  "name": "testdb",
                  "connectionType": "redshift"
                }
              }
            ]
          }
        ]
      }
    ]
  }
}
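The output should look like this:
I tried using json_normalize, but could not get it to work. At the moment I parse the data by looping over the nested arrays and reading the values by key. I am looking for a better way to normalize the JSON.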
Here is a partial solution:
First, save the data as a JSON file named data.json in the same directory as the script.
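A minimal sketch of what this step could look like, assuming pandas 1.0 or later (where json_normalize is available as pd.json_normalize). The helper names load_workbooks and tables_frame, the "table_" prefix, and the trimmed inline sample are my own assumptions for illustration, not something taken from the original answer. It produces one row per upstream table and repeats the workbook-level projectName and name on each row.

```python
import json

import pandas as pd


def load_workbooks(path="data.json"):
    """Read the saved file and return the list under data -> workbooks."""
    with open(path, encoding="utf-8") as f:
        return json.load(f)["data"]["workbooks"]


def tables_frame(workbooks):
    """One row per upstream table, with workbook fields repeated as meta."""
    return pd.json_normalize(
        workbooks,
        # Walk workbook -> embeddedDatasources -> upstreamTables.
        record_path=["embeddedDatasources", "upstreamTables"],
        # Top-level workbook fields to copy onto every table row.
        meta=["projectName", "name"],
        # Tables also have a "name" key, so prefix the record columns
        # to keep them apart from the workbook "name" in meta.
        record_prefix="table_",
    )


# Trimmed sample with the same shape as the question's data (assumption:
# in real use you would call tables_frame(load_workbooks()) instead).
sample = [
    {
        "projectName": "TestProject",
        "name": "wkb1",
        "embeddedDatasources": [
            {
                "name": "DS1",
                "upstreamTables": [
                    {
                        "name": "table_1",
                        "schema": "schema_1",
                        "database": {"name": "testdb", "connectionType": "redshift"},
                    },
                    {
                        "name": "table_2",
                        "schema": "schema_2",
                        "database": {"name": "testdb", "connectionType": "redshift"},
                    },
                ],
            }
        ],
    }
]

df = tables_frame(sample)
print(df)
```

Nested dicts inside each record, such as database, are expanded into dotted columns (here table_database.name and table_database.connectionType), so they do not need separate handling at this level.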
Output:
What's next?
I think that if you restructure the data a bit beforehand (for example, by flattening 'database': {'name': 'testdb', 'connectionType': 'redshift'}), you will be able to add more fields to the meta parameter.
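One way to read "restructure the data beforehand" is to copy the parent fields down onto each embedded datasource first, so that every field you want is a plain top-level key that meta can name directly. The sketch below is only an illustration under that assumption; lift_parent_fields, tables_with_context, and the output column names (workbook, site, owner, datasource) are hypothetical names I chose, not part of the original answer.

```python
import pandas as pd


def lift_parent_fields(workbooks):
    """Return one dict per embedded datasource, with workbook fields copied down."""
    rows = []
    for wb in workbooks:
        for ds in wb.get("embeddedDatasources", []):
            rows.append(
                {
                    "projectName": wb.get("projectName"),
                    "workbook": wb.get("name"),
                    # Pull single values out of the nested dicts up front.
                    "site": wb.get("site", {}).get("name"),
                    "owner": wb.get("owner", {}).get("username"),
                    "datasource": ds.get("name"),
                    "upstreamTables": ds.get("upstreamTables", []),
                }
            )
    return rows


def tables_with_context(workbooks):
    """One row per upstream table, carrying workbook and datasource context."""
    return pd.json_normalize(
        lift_parent_fields(workbooks),
        record_path="upstreamTables",
        meta=["projectName", "workbook", "site", "owner", "datasource"],
        record_prefix="table_",
    )


# Trimmed sample shaped like the question's data.
sample = [
    {
        "projectName": "TestProject",
        "name": "wkb1",
        "site": {"name": "site1"},
        "owner": {"name": "user1", "username": "John"},
        "embeddedDatasources": [
            {
                "name": "DS1",
                "upstreamTables": [
                    {"name": "table_1", "schema": "schema_1",
                     "database": {"name": "testdb", "connectionType": "redshift"}},
                    {"name": "table_2", "schema": "schema_2",
                     "database": {"name": "testdb", "connectionType": "redshift"}},
                ],
            },
            {
                "name": "DS2",
                "upstreamTables": [
                    {"name": "table_4", "schema": "schema_1",
                     "database": {"name": "testdb", "connectionType": "redshift"}},
                ],
            },
        ],
    }
]

df = tables_with_context(sample)
print(df)
```

This still loops over workbooks and datasources once, but only to reshape the input; the per-table expansion and the repetition of the parent values are left to json_normalize.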
As you can see in the documentation for json_normalize, the four parameters used here are:
data: dict or list of dicts
record_path: str or list of str, default None
meta: list of paths (str or list of str), default None
record_prefix: str, default None
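To see how these parameters interact, here is a toy example on made-up data (the dict below is my own, not from the question). It shows why record_prefix matters when a record and a meta field share a key such as name: in the pandas versions I am aware of, json_normalize raises a ValueError about conflicting metadata names unless a distinguishing prefix is given.

```python
import pandas as pd

# Toy input: a single dict is treated as one record.
data = {
    "name": "wkb1",
    "tables": [{"name": "table_1"}, {"name": "table_2"}],
}

# Without a prefix, the record column "name" clashes with the meta "name".
try:
    pd.json_normalize(data, record_path="tables", meta=["name"])
    collided = False
except ValueError:
    collided = True

# With record_prefix, record columns are renamed before meta is attached.
df = pd.json_normalize(
    data,
    record_path="tables",  # where the list of records lives
    meta=["name"],         # parent field repeated on every row
    record_prefix="table_",
)
print(collided)
print(df)
```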