Unable to flatten a JSON file with multiple values

{
  "number": "Abc", "date": "01.10.2016", "name": "R 3932",
  "locations": [
    { "depTimeDiffMin": "0", "name": "Spital am Pyhrn Bahnhof", "arrTime": "", "depTime": "06:32", "platform": "2", "stationIdx": "0", "arrTimeDiffMin": "", "track": "R 3932" },
    { "depTimeDiffMin": "0", "name": "Windischgarsten Bahnhof", "arrTime": "06:37", "depTime": "06:40", "platform": "2", "stationIdx": "1", "arrTimeDiffMin": "1", "track": "" },
    { "depTimeDiffMin": "", "name": "Linz/Donau Hbf", "arrTime": "08:24", "depTime": "", "platform": "1A-B", "stationIdx": "22", "arrTimeDiffMin": "1", "track": "" }
  ]
}
{
  "number": "Xyz", "date": "01.10.2016", "name": "R 3932",
  "locations": [
    { "depTimeDiffMin": "0", "name": "Spital am Pyhrn Bahnhof", "arrTime": "", "depTime": "06:32", "platform": "2", "stationIdx": "0", "arrTimeDiffMin": "", "track": "R 3932" },
    { "depTimeDiffMin": "0", "name": "Windischgarsten Bahnhof", "arrTime": "06:37", "depTime": "06:40", "platform": "2", "stationIdx": "1", "arrTimeDiffMin": "1", "track": "" },
    { "depTimeDiffMin": "", "name": "Linz/Donau Hbf", "arrTime": "08:24", "depTime": "", "platform": "1A-B", "stationIdx": "22", "arrTimeDiffMin": "1", "track": "" }
  ]
}

import json
import pandas as pd
import numpy as np
from pandas.io.json import json_normalize

desired_width = 500
pd.set_option('display.width', desired_width)
np.set_printoptions(linewidth=desired_width)
pd.set_option('display.max_columns', 100)

with open('C:/Users/username/Desktop/samplejson.json') as f:
    data = json.load(f)

def flatten_json(y):
    out = {}

    def flatten(x, name=''):
        if type(x) is dict:
            for a in x:
                flatten(x[a], name + a + '_')
        elif type(x) is list:
            i = 0
            for a in x:
                flatten(a, name + str(i) + '_')
                i += 1
        else:
            out[name[:-1]] = x

    flatten(y)
    return out

for data in data:
    flat = flatten_json(data)
    new_flat = json_normalize(flat)
    dfs = pd.DataFrame(new_flat)
    print(dfs.head(2))

2 Answers

User

Answer 1 · edited 2024-05-19 12:03:52

You can do it like this, assuming j is the complete JSON object (the list of both records).

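import pandas as pd

def parse(j):
    for item in j:
        # Top-level fields (everything except 'locations') as a one-row frame
        data = pd.DataFrame([{k: v for k, v in item.items() if k != 'locations'}])
        # One row per entry in the nested 'locations' list
        locs = pd.DataFrame(item.get('locations'))
        # Put them side by side, then forward-fill the top-level fields
        # (.ffill() replaces the deprecated fillna(method='ffill'))
        yield pd.concat([data, locs], axis=1).ffill()

pd.concat(parse(j), axis=0, ignore_index=True)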

         date    name number arrTime   ...                       name platform stationIdx   track
0  01.10.2016  R 3932    Abc           ...    Spital am Pyhrn Bahnhof        2          0  R 3932
1  01.10.2016  R 3932    Abc   06:37   ...    Windischgarsten Bahnhof        2          1        
2  01.10.2016  R 3932    Abc   08:24   ...             Linz/Donau Hbf     1A-B         22        
3  01.10.2016  R 3932    Xyz           ...    Spital am Pyhrn Bahnhof        2          0  R 3932
4  01.10.2016  R 3932    Xyz   06:37   ...    Windischgarsten Bahnhof        2          1        
5  01.10.2016  R 3932    Xyz   08:24   ...             Linz/Donau Hbf     1A-B         22

However, your JSON is invalid because you are missing a , to separate the two objects.
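As an alternative to the generator above, pandas' own `pd.json_normalize` can expand the nested list through `record_path` and carry the top-level fields along through `meta`. The following is only a minimal sketch under some assumptions: the `records` list is a trimmed stand-in for the question's data, already loaded as a valid list, and the `loc_` prefix is an arbitrary choice, needed because `name` appears at both levels and would otherwise conflict.

```python
import pandas as pd

# Trimmed stand-in for the question's data (assumption: a valid list of records).
records = [
    {"number": "Abc", "date": "01.10.2016", "name": "R 3932",
     "locations": [
         {"name": "Spital am Pyhrn Bahnhof", "depTime": "06:32", "stationIdx": "0"},
         {"name": "Windischgarsten Bahnhof", "depTime": "06:40", "stationIdx": "1"},
     ]},
    {"number": "Xyz", "date": "01.10.2016", "name": "R 3932",
     "locations": [
         {"name": "Linz/Donau Hbf", "depTime": "", "stationIdx": "22"},
     ]},
]

# One output row per location; the top-level fields are repeated on each row.
flat = pd.json_normalize(
    records,
    record_path="locations",          # the nested list to expand into rows
    meta=["number", "date", "name"],  # top-level fields to repeat
    record_prefix="loc_",             # avoids the clash on 'name'
)
print(flat)
```

Compared with concatenating frames by hand, this keeps the two `name` columns distinct (`loc_name` and `name`) instead of producing duplicate column labels.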

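User

Answer 2 · edited 2024-05-19 12:03:52

If the file contains only one message, it is valid JSON. With more than one message, placed one after another the way you have them, it is no longer valid JSON ([JSON]: Introducing JSON). Example: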

>>> json.loads("{}")
{}
>>> json.loads("{} {}")
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "c:\Install\x64\Python\Python\03.06.08\Lib\json\__init__.py", line 354, in loads
    return _default_decoder.decode(s)
  File "c:\Install\x64\Python\Python\03.06.08\Lib\json\decoder.py", line 342, in decode
    raise JSONDecodeError("Extra data", s, end)
json.decoder.JSONDecodeError: Extra data: line 1 column 4 (char 3)
>>> json.loads("[{}, {}]")
[{}, {}]

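For more details, see [Python 3]: json - JSON encoder and decoder.

The simplest way to make valid JSON that contains multiple messages:

- Enclose all of them in square brackets ("[", "]")
- Separate every two consecutive messages with a comma (",")

This is exactly how the "locations" sub-messages are already written.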
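If editing the file by hand is not practical, the concatenated messages can also be read one at a time with the standard library's `json.JSONDecoder.raw_decode`, which parses one value and reports where it stopped. This is a minimal sketch, not part of the original answer: the helper name `iter_json_objects` and the sample string `raw` are made up for illustration.

```python
import json

def iter_json_objects(text):
    """Yield each top-level JSON value from a string of concatenated values."""
    decoder = json.JSONDecoder()
    idx, end = 0, len(text)
    while idx < end:
        # raw_decode does not skip leading whitespace, so skip it here.
        if text[idx].isspace():
            idx += 1
            continue
        # Returns the parsed value and the index just past it.
        obj, idx = decoder.raw_decode(text, idx)
        yield obj

# Two messages with no enclosing brackets and no comma, as in the question.
raw = '{"number": "Abc"} {"number": "Xyz"}'
messages = list(iter_json_objects(raw))

# Re-serialising the list yields a valid JSON array.
valid_json = json.dumps(messages)
print(valid_json)
```

The resulting `messages` list can then be passed to pandas, or `valid_json` written back to disk so that a plain `json.load` works afterwards.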
