Opening a JSON file as a DataFrame

Posted 2024-06-28 20:02:41

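Sorry for asking such a trivial question:

I have a JSON file, first.json, and I want to open it with pandas.read_json:

df = pandas.read_json('first.json') gives a result that is not what I need: it must be just 1 row with many columns.

The result I need is a single row with the keys ('name', 'street', 'geo', 'servesCuisine', etc.) as columns. I tried different values of the "orient" parameter, but that did not help. How can I get the DataFrame in the desired format?

Here is the data in my JSON file: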

{
    "name": "La Continental (San Telmo)",
    "geo": {
        "longitude": "-58.371852",
        "latitude": "-34.616099"
    },
    "servesCuisine": "Italian",
    "containedInPlace": {},
    "priceRange": 450,
    "currenciesAccepted": "ARS",
    "address": {
        "street": "Defensa 701",
        "postalCode": "C1065AAM",
        "locality": "Autonomous City of Buenos Aires",
        "country": "Argentina"
    },
    "aggregateRatings": {
        "thefork": {
            "ratingValue": 9.3,
            "reviewCount": 3
        },
        "tripadvisor": {
            "ratingValue": 4,
            "reviewCount": 350
        }
    },
    "id": "585777"
}

2 answers

You can read the JSON file with plain Python, convert it to a dict object, and then pick out the data items by hand to build a new DataFrame from them.

import json
import pandas as pd

# open the JSON data file and parse it into a dict
# (json.load is safer than eval and also handles true/false/null)
with open("test11.json", "r") as fo:
    inp_json = json.load(fo)

# Or 
# inp_json = your_json_data

# prepare 1 row of data
axis1 = [[inp_json["name"], inp_json["address"]["street"], inp_json["geo"], inp_json["servesCuisine"],
          inp_json["aggregateRatings"]["tripadvisor"]["ratingValue"],
          inp_json["id"],
         ], ] #for data
axis0 = ['row_1', ]  #for index
heads = ["name", "add_.street", "geo", "servesCuisine",
        "agg_.tripadv_.ratingValue", "id", ]

# create a dataframe using the prepped values above
df0 = pd.DataFrame(axis1, index=axis0, columns=heads)


# see data in selected columns only
df0[["name","add_.street","id"]]

                             name  add_.street      id
row_1  La Continental (San Telmo)  Defensa 701  585777
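
The same hand-picked row can also be built from a list holding one dict, which keeps each value next to its column name. This is only a sketch: it assumes the JSON has already been parsed into a dict, and the data below is a trimmed inline copy of the question's file rather than something read from disk.

```python
import pandas as pd

# Trimmed inline copy of the question's data (assumption: already parsed into a dict)
inp_json = {
    "name": "La Continental (San Telmo)",
    "address": {"street": "Defensa 701", "country": "Argentina"},
    "aggregateRatings": {"tripadvisor": {"ratingValue": 4, "reviewCount": 350}},
    "id": "585777",
}

# One dict per row: keys become column names, values become the cells
row = {
    "name": inp_json["name"],
    "add_.street": inp_json["address"]["street"],
    "agg_.tripadv_.ratingValue": inp_json["aggregateRatings"]["tripadvisor"]["ratingValue"],
    "id": inp_json["id"],
}

# A list with a single dict yields a DataFrame with a single row
df0 = pd.DataFrame([row], index=["row_1"])
print(df0[["name", "add_.street", "id"]])
```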

You can try:

import json
import pandas as pd

with open("test.json") as fp:
    s = json.load(fp)

# flattened df, where nested keys -> column as `key1.key2.key_last`
df = pd.json_normalize(s)

# rename cols to innermost key only (be sure you don't overwrite cols)
cols = {col:col.split(".")[-1] for col in df.columns}
df = df.rename(columns=cols)

Output:

                         name servesCuisine  priceRange currenciesAccepted      id  ...    country ratingValue reviewCount ratingValue reviewCount
0  La Continental (San Telmo)       Italian         450                ARS  585777  ...  Argentina         9.3           3           4         350
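
Note that renaming every column to its innermost key leaves two columns named ratingValue and two named reviewCount in the output above. One possible way around that, shown here as a minimal sketch, is to shorten a column only when its innermost key is unique and keep the full dotted path otherwise. The data is a trimmed inline copy of the question's file, and the names raw and leaf_counts are illustrative, not part of the original answer.

```python
import json
from collections import Counter

import pandas as pd

# Trimmed copy of the question's data, inlined so the sketch runs without a file
raw = """
{
    "name": "La Continental (San Telmo)",
    "geo": {"longitude": "-58.371852", "latitude": "-34.616099"},
    "address": {"street": "Defensa 701", "country": "Argentina"},
    "aggregateRatings": {
        "thefork": {"ratingValue": 9.3, "reviewCount": 3},
        "tripadvisor": {"ratingValue": 4, "reviewCount": 350}
    },
    "id": "585777"
}
"""

# Flatten nested keys into dotted column names, one row for the whole object
df = pd.json_normalize(json.loads(raw))

# Count how often each innermost key occurs across the flattened columns
leaf_counts = Counter(col.split(".")[-1] for col in df.columns)

# Shorten only the columns whose innermost key is unique
cols = {
    col: col.split(".")[-1]
    for col in df.columns
    if leaf_counts[col.split(".")[-1]] == 1
}
df = df.rename(columns=cols)

print(df.columns.tolist())
```

With this approach the two rating sources stay distinguishable, at the cost of longer names for the colliding columns.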
