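I have been trying to fetch data from https://anapioficeandfire.com/ and load it into a Spark DataFrame. Everything worked fine until I reached character 1173, where I got a corrupt record error:
(AnalysisException: Since Spark 2.3, the queries from raw JSON/CSV files are disallowed when the referenced columns only include the internal corrupt record column)
I think it is related to escape characters in the alias column, but I don't know how to fix it.
Below is part of my code, which should be enough to reproduce the error. Maybe someone has already solved this problem.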
import json
import requests

def send_request(api_object):
    page = 1173
    url = 'https://anapioficeandfire.com/api/'
    req = url + api_object + '?page=' + str(page) + '&pageSize=1'
    response = requests.get(req)
    results = response.json()
    return results

dbutils.fs.put("books.json", str(send_request('characters')), True)
df = spark.read.json("books.json", multiLine=True)
#df = spark.read.json(sc.parallelize([send_request('characters')]))
display(df)
send_request('characters')
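One possible cause, offered as a guess rather than a confirmed fix: `str(...)` on the parsed response writes Python's repr of the object (single-quoted strings, `None`, `True`), which is not JSON, so a record whose strings contain quotes can end up unparseable. Serializing with `json.dumps` instead may avoid that. The sketch below uses a made-up `sample` record and a hypothetical helper `is_valid_json`; neither comes from the API or from the code above.

```python
import json

# Hypothetical record, invented for illustration; NOT actual API output.
sample = [{"name": "Example",
           "aliases": ['The "Quoted" One', "O'Example"],
           "born": None}]

def is_valid_json(text):
    """Return True if `text` parses as JSON, False otherwise."""
    try:
        json.loads(text)
        return True
    except ValueError:  # json.JSONDecodeError is a subclass of ValueError
        return False

as_repr = str(sample)         # Python repr: single quotes, None
as_json = json.dumps(sample)  # real JSON: double quotes, null, escaped quotes

print(is_valid_json(as_repr))
print(is_valid_json(as_json))

# In the notebook (assumes the `spark`, `sc` and `dbutils` objects that
# Databricks provides, plus send_request from the question):
# dbutils.fs.put("books.json", json.dumps(send_request('characters')), True)
# df = spark.read.json("books.json", multiLine=True)
# or, without the intermediate file:
# df = spark.read.json(sc.parallelize([json.dumps(send_request('characters'))]))
```

If the `json.dumps` output still produces a corrupt record, the problem is more likely in the data itself than in the serialization step.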