Loading JSON data into a DataFrame in Python

Posted 2024-09-30 04:36:46


list = {'masterId': 2, 'name': 'name', 'description': 'xyz', 'signalTypeRefId': 4, 'unitOfMeasureRefId': 1, 'precision': 1, 'min': -125, 'max': 125, 'isDeprecated': False}

I need to load the JSON above into a DataFrame. I tried the following, but it did not work:

df = pd.DataFrame.from_dict(list, orient = 'index')
display(df)

Error:

TypeError: field 0: Can not merge type <class 'pyspark.sql.types.LongType'> and <class 'pyspark.sql.types.StringType'>


3 answers


import pandas as pd

dct = {'masterId': 2, 'name': 'name', 'description': 'xyz', 'signalTypeRefId': 4, 'unitOfMeasureRefId': 1, 'precision': 1, 'min': -125, 'max': 125, 'isDeprecated': False}

df = pd.DataFrame.from_dict(dct, orient="index")
display(df)

"""
                        0
masterId                2
name                 name
description           xyz
signalTypeRefId         4
unitOfMeasureRefId      1
precision               1
min                  -125
max                   125
isDeprecated        False
"""

To get it as a single row, use .transpose():

df.transpose()
"""
Out[15]: 
  masterId  name description     ...        min  max isDeprecated
0        2  name         xyz     ...       -125  125        False
"""
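
A possible explanation for the original error, offered as a sketch rather than a confirmed diagnosis: with orient="index", all nine values land in one column, which then mixes integers, strings and a boolean. If display() converts the pandas frame to a Spark DataFrame behind the scenes (an assumption about the environment that produced the error), Spark has to pick one type for that column and cannot. The comparison below uses only pandas and shows how the two layouts differ:

```python
import pandas as pd

dct = {'masterId': 2, 'name': 'name', 'description': 'xyz',
       'signalTypeRefId': 4, 'unitOfMeasureRefId': 1, 'precision': 1,
       'min': -125, 'max': 125, 'isDeprecated': False}

# One column, nine rows: the values have different Python types,
# so pandas falls back to the generic object dtype for that column.
as_column = pd.DataFrame.from_dict(dct, orient="index")
print(as_column.dtypes)

# One row, nine columns: each column holds a single value,
# so pandas can choose a specific dtype per column.
as_row = pd.DataFrame([dct])
print(as_row.dtypes)
```

Wrapping the dictionary in a list gives each key its own column, which is the layout the other answers rely on.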

data = pd.DataFrame([list])  # `list` is the dictionary defined in the question

For more on converting JSON to a DataFrame, see the link below :)

https://pandas.pydata.org/docs/reference/api/pandas.read_json.html
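
If the data arrives as JSON text rather than as a Python dictionary, one option is to parse it with the standard json module and pass the result to pandas. This is a minimal sketch; the variable names and the shortened sample string are illustrative and not taken from the question:

```python
import json
import pandas as pd

# Illustrative JSON text with a subset of the question's fields.
json_text = '{"masterId": 2, "name": "name", "description": "xyz", "min": -125, "max": 125, "isDeprecated": false}'

record = json.loads(json_text)   # JSON object -> Python dict
df = pd.json_normalize(record)   # one flat dict -> one-row DataFrame
print(df)
```

pd.json_normalize is used here because it accepts a single dictionary directly; pd.DataFrame([record]) would give the same single-row layout for flat data like this.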

Before creating the DataFrame, you need to wrap the dictionary in a list:

data = {'masterId': 2, 'name': 'name', 'description': 'xyz', 'signalTypeRefId': 4, 'unitOfMeasureRefId': 1, 'precision': 1, 'min': -125, 'max': 125, 'isDeprecated': False}

df = spark.createDataFrame([data])

df.show()
+-----------+------------+--------+---+----+----+---------+---------------+------------------+
|description|isDeprecated|masterId|max| min|name|precision|signalTypeRefId|unitOfMeasureRefId|
+-----------+------------+--------+---+----+----+---------+---------------+------------------+
|        xyz|       false|       2|125|-125|name|        1|              4|                 1|
+-----------+------------+--------+---+----+----+---------+---------------+------------------+

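Alternatively, you can convert it to a pandas DataFrame first and create the Spark DataFrame from that, but you still need to wrap the dictionary in a list: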

data = {'masterId': 2, 'name': 'name', 'description': 'xyz', 'signalTypeRefId': 4, 'unitOfMeasureRefId': 1, 'precision': 1, 'min': -125, 'max': 125, 'isDeprecated': False}

df = spark.createDataFrame(pd.DataFrame([data]))

df.show()
+--------+----+-----------+---------------+------------------+---------+----+---+------------+
|masterId|name|description|signalTypeRefId|unitOfMeasureRefId|precision| min|max|isDeprecated|
+--------+----+-----------+---------------+------------------+---------+----+---+------------+
|       2|name|        xyz|              4|                 1|        1|-125|125|       false|
+--------+----+-----------+---------------+------------------+---------+----+---+------------+
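
Before handing a pandas DataFrame to spark.createDataFrame, it may help to check whether any column mixes Python types, since that appears to be what Spark's type inference rejects in the error from the question. The helper below is hypothetical (mixed_type_columns is not part of pandas or Spark) and needs only pandas:

```python
import pandas as pd

def mixed_type_columns(frame):
    """Return labels of columns whose values have more than one Python type."""
    return [col for col in frame.columns
            if frame[col].map(type).nunique() > 1]

data = {'masterId': 2, 'name': 'name', 'description': 'xyz',
        'signalTypeRefId': 4, 'unitOfMeasureRefId': 1, 'precision': 1,
        'min': -125, 'max': 125, 'isDeprecated': False}

# Dictionary wrapped in a list: one row, one value per column.
row_layout = mixed_type_columns(pd.DataFrame([data]))

# orient="index": every value stacked into a single column labelled 0.
column_layout = mixed_type_columns(pd.DataFrame.from_dict(data, orient="index"))

print(row_layout, column_layout)
```

An empty result means every column holds values of a single type; any label that is returned points at a column worth cleaning up or casting first.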
