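I have a JSON file that I am trying to flatten. If the file contains only one message, the function works fine, but if it contains multiple messages I get the following error: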
raise JSONDecodeError("Extra data", s, end)
json.decoder.JSONDecodeError: Extra data: line 39 column 1 (char 952)
Sample JSON file:
{
  "number": "Abc",
  "date": "01.10.2016",
  "name": "R 3932",
  "locations": [
    {
      "depTimeDiffMin": "0",
      "name": "Spital am Pyhrn Bahnhof",
      "arrTime": "",
      "depTime": "06:32",
      "platform": "2",
      "stationIdx": "0",
      "arrTimeDiffMin": "",
      "track": "R 3932"
    },
    {
      "depTimeDiffMin": "0",
      "name": "Windischgarsten Bahnhof",
      "arrTime": "06:37",
      "depTime": "06:40",
      "platform": "2",
      "stationIdx": "1",
      "arrTimeDiffMin": "1",
      "track": ""
    },
    {
      "depTimeDiffMin": "",
      "name": "Linz/Donau Hbf",
      "arrTime": "08:24",
      "depTime": "",
      "platform": "1A-B",
      "stationIdx": "22",
      "arrTimeDiffMin": "1",
      "track": ""
    }
  ]
}
{
  "number": "Xyz",
  "date": "01.10.2016",
  "name": "R 3932",
  "locations": [
    {
      "depTimeDiffMin": "0",
      "name": "Spital am Pyhrn Bahnhof",
      "arrTime": "",
      "depTime": "06:32",
      "platform": "2",
      "stationIdx": "0",
      "arrTimeDiffMin": "",
      "track": "R 3932"
    },
    {
      "depTimeDiffMin": "0",
      "name": "Windischgarsten Bahnhof",
      "arrTime": "06:37",
      "depTime": "06:40",
      "platform": "2",
      "stationIdx": "1",
      "arrTimeDiffMin": "1",
      "track": ""
    },
    {
      "depTimeDiffMin": "",
      "name": "Linz/Donau Hbf",
      "arrTime": "08:24",
      "depTime": "",
      "platform": "1A-B",
      "stationIdx": "22",
      "arrTimeDiffMin": "1",
      "track": ""
    }
  ]
}
My code:
import json
import pandas as pd
import numpy as np
from pandas.io.json import json_normalize

desired_width = 500
pd.set_option('display.width', desired_width)
np.set_printoptions(linewidth=desired_width)
pd.set_option('display.max_columns', 100)

with open('C:/Users/username/Desktop/samplejson.json') as f:
    data = json.load(f)

def flatten_json(y):
    out = {}

    def flatten(x, name=''):
        if type(x) is dict:
            for a in x:
                flatten(x[a], name + a + '_')
        elif type(x) is list:
            i = 0
            for a in x:
                flatten(a, name + str(i) + '_')
                i += 1
        else:
            out[name[:-1]] = x

    flatten(y)
    return out

for data in data:
    flat = flatten_json(data)
    new_flat = json_normalize(flat)
    dfs = pd.DataFrame(new_flat)
    print(dfs.head(2))
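I am trying to parse the whole JSON file and load all the data into a DataFrame so that I can start analyzing it. If the file contains only one message, the code works fine and prints a very wide table with many columns.
If the JSON file contains multiple messages, I get the error shown above. I have looked at many solutions on Stack Overflow, but none of them seem to work for me.
Is there a simpler way to read and flatten the JSON file? I tried pandas' json_normalize, but it only flattens the first level.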
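You can do it like this, assuming j is the complete JSON object.

However, your JSON is invalid, because you are missing a comma (,) to separate the two objects.

If the file contains only one message, it is valid JSON, but once you add more messages (the way you placed them), it is no longer valid JSON ([JSON]: Introducing JSON). Example: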
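The answer's original example is not preserved here. As a minimal sketch of the behaviour it describes, the snippet below parses one message successfully and then fails on two messages placed back to back. The strings `single` and `concatenated` are shortened, made-up stand-ins for the sample file.

```python
import json

# One message on its own is a valid JSON document.
single = '{"number": "Abc"}'
parsed = json.loads(single)

# Two messages back to back, with no enclosing [ ] and no comma,
# are not a single JSON document.
concatenated = '{"number": "Abc"}\n{"number": "Xyz"}'

try:
    json.loads(concatenated)
    error_message = None
except json.JSONDecodeError as exc:
    # exc.msg holds the short message without the position details.
    error_message = exc.msg

print(parsed)
print(error_message)
```

The decoder reads the first object, then finds more non-whitespace text after it, which is what the "Extra data" error in the question reports.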
For more details, see [Python 3]: json - JSON encoder and decoder.
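The simplest way to make valid JSON that contains multiple messages is to structure them just as in the case of the "locations" sub-messages.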
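A hedged sketch of what this could look like, using shortened, made-up messages: the first part wraps the messages in [ ] separated by a comma, as the "locations" list does. The second part is an alternative that is not from the answer above: it reads back-to-back objects without editing the file, using `json.JSONDecoder.raw_decode`. The helper name `iter_concatenated` is hypothetical.

```python
import json

# Messages wrapped in [ ] and separated by a comma form one valid JSON document.
valid_text = '[{"number": "Abc", "locations": []}, {"number": "Xyz", "locations": []}]'
messages = json.loads(valid_text)

def iter_concatenated(text):
    """Yield each top-level JSON value from text that holds several values back to back."""
    decoder = json.JSONDecoder()
    pos = 0
    while pos < len(text):
        # raw_decode does not skip leading whitespace, so step over it here.
        if text[pos].isspace():
            pos += 1
            continue
        # raw_decode returns the parsed value and the index where it ended.
        obj, pos = decoder.raw_decode(text, pos)
        yield obj

# Same two messages, laid out the way the sample file in the question is.
concatenated_text = '{"number": "Abc", "locations": []}\n{"number": "Xyz", "locations": []}\n'
recovered = list(iter_concatenated(concatenated_text))

print(messages == recovered)
```

Either way the result is a plain list of dicts, so each element can be passed to the question's flatten_json in a loop.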