How to convert JSON data to Avro format using Python

Published 2024-09-29 02:29:16

I want to convert the JSON data below to Avro format. I am using the following snippet to write the JSON data as Avro, but I get an error. Any help would be appreciated.

from fastavro import writer, reader, schema
from rec_avro import to_rec_avro_destructive, from_rec_avro_destructive, rec_avro_schema

def getweatherdata():
    url = 'https://api.openweathermap.org/data/2.5/onecall?lat=33.441792&lon=-94.037689&exclude=hourly,daily&appid=' + apikey
    response = requests.get(url)
    data = response.text
    return data
 
def turntoavro():
    avro_objects = (to_rec_avro_destructive(rec) for rec in getweatherdata())
    with open('json_in_avro.avro', 'wb') as f_out:
        writer(f_out, schema.parse_schema(rec_avro_schema()), avro_objects)



turntoavro()

    Error details:
    
      File "fastavro/_write.pyx", line 269, in fastavro._write.write_record
    TypeError: Expected dict, got str
    
    During handling of the above exception, another exception occurred:
    
    Traceback (most recent call last):
      File "datalake.py", line 30, in <module>
        turntoavro()
      File "datalake.py", line 26, in turntoavro
        writer(f_out, schema.parse_schema(rec_avro_schema()), avro_objects)
      File "fastavro/_write.pyx", line 652, in fastavro._write.writer
      File "fastavro/_write.pyx", line 605, in fastavro._write.Writer.write
      File "fastavro/_write.pyx", line 341, in fastavro._write.write_data
      File "fastavro/_write.pyx", line 278, in fastavro._write.write_record
    AttributeError: 'str' object has no attribute 'get'

Sample data:

    {
      "lat": 33.44,
      "lon": -94.04,
      "timezone": "America/Chicago",
      "timezone_offset": -18000
    }

2 answers

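To retrieve the response to your request you used response.text, which returns the body as a string rather than as parsed JSON. Use response.json() to parse it into a Python object: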

import requests

def getweatherdata():
    url = 'https://api.openweathermap.org/data/2.5/onecall?lat=33.441792&lon=-94.037689&exclude=hourly,daily&appid=' + apikey
    response = requests.get(url)
    # response.json() parses the JSON body into a Python dict
    data = response.json()
    return data
     
def turntoavro():
    avro_objects = (to_rec_avro_destructive(rec) for rec in getweatherdata())
    with open('json_in_avro.avro', 'wb') as f_out:
        writer(f_out, schema.parse_schema(rec_avro_schema()), avro_objects)
    
    
    
turntoavro()
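
The difference between the raw text and the parsed object can be seen without calling the API. The sketch below is only an illustration: `raw_body` is a hypothetical stand-in for response.text built from the sample data in the question, and json.loads is used here as a stand-in for the parsing that response.json() performs on the body.

```python
import json

# Hypothetical stand-in for response.text: the raw body is just a string.
raw_body = '{"lat": 33.44, "lon": -94.04, "timezone": "America/Chicago", "timezone_offset": -18000}'

# Iterating over a string yields single characters, which is most likely
# why the writer reported "Expected dict, got str".
first_items_from_text = list(raw_body)[:3]

# Parsing the string gives a dict that can be treated as a record.
parsed = json.loads(raw_body)

print(type(raw_body).__name__, first_items_from_text)
print(type(parsed).__name__, parsed["timezone"])
```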

As one of the answers mentions, you probably want response.json() rather than response.text, so that you get an actual dictionary parsed from the JSON.

However, there is a second problem: getweatherdata() returns a single dictionary, so avro_objects = (to_rec_avro_destructive(rec) for rec in getweatherdata()) iterates over the keys of that dictionary. Instead, use avro_objects = [to_rec_avro_destructive(getweatherdata())].
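
A small illustration of that point, using only the sample data from the question (the variable names here are made up for the example):

```python
# Sample record, copied from the question's sample data.
record = {
    "lat": 33.44,
    "lon": -94.04,
    "timezone": "America/Chicago",
    "timezone_offset": -18000,
}

# Iterating over a dict walks its keys, so every "record" is a str.
items_when_iterating = [item for item in record]

# Wrapping the dict in a list yields exactly one item: the dict itself.
items_when_wrapped = [item for item in [record]]

print(items_when_iterating)
print(len(items_when_wrapped), type(items_when_wrapped[0]).__name__)
```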

I believe this code should work for you:

import requests
from fastavro import writer, reader, schema
from rec_avro import to_rec_avro_destructive, from_rec_avro_destructive, rec_avro_schema

# apikey is assumed to be defined elsewhere, as in the question.

def getweatherdata():
    url = 'https://api.openweathermap.org/data/2.5/onecall?lat=33.441792&lon=-94.037689&exclude=hourly,daily&appid=' + apikey
    response = requests.get(url)
    # Parse the JSON body into a dict instead of returning the raw text.
    data = response.json()
    return data

def turntoavro():
    # getweatherdata() returns one dict, so wrap it in a list holding a single record.
    avro_objects = [to_rec_avro_destructive(getweatherdata())]
    with open('json_in_avro.avro', 'wb') as f_out:
        writer(f_out, schema.parse_schema(rec_avro_schema()), avro_objects)

turntoavro()
