Pandas: DataFrame has only one entry

Posted 2024-09-29 19:22:09


I am currently comparing German coronavirus data from different sources. For this I use the RKI API and get the following DataFrame:

0 ObjectId[{'name':'IdBundesland','type':'ESRIFELDTY…True[{'attributes':{'IdBundesland':5',Bundeslan…ObjectId True

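This output is wrong: it has only one entry, and that entry does not look the way it should. Does anyone know what the reason is? Perhaps it is pd.json_normalize(r_json), but without that call I get a ValueError.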

My code (the API link is in the code):


import requests
import pandas as pd
from pandas.io.json import json_normalize
import json

link = 'https://services7.arcgis.com/mOBPykOjAyBO2ZKk/arcgis/rest/services/RKI_COVID19/FeatureServer/0/query?where=1%3D1&outFields=*&outSR=4326&f=json'
payload = {}
response = requests.request(
    method='get',
    url=link,
    params=payload,
    timeout=5
)

r_json = response.json()
r_json = pd.json_normalize(r_json)
# print(response.status_code)
# print(r_json)
df = pd.DataFrame.from_dict(r_json)
print(df)


2 Answers

You need to increase the offset by the number of rows actually read; otherwise you will get duplicate or missing records:

...
print(f"exceededTransferLimit: {exceeded_transfer_limit}")

newData_df = pd.DataFrame([dict_row['attributes'] for dict_row in web_json['features']])
result_df = pd.concat([result_df, newData_df])  # DataFrame.append was removed in pandas 2.0

offset += len(newData_df) # get next set of rows

if len(result_df) ...
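
As a rough, self-contained sketch of the idea above (advance the offset by the rows actually received, and stop once the service no longer reports exceededTransferLimit): the helper names `collect_all_rows`, `fetch_page` and `fake_fetch_page` are hypothetical and not part of the original answer, and the fake page function only stands in for the real request so the loop can be tried offline.

```python
import pandas as pd


def collect_all_rows(fetch_page):
    """Page through a feature service, advancing the offset by rows actually read.

    `fetch_page` is a hypothetical callable: it takes an offset and returns the
    decoded JSON dict of one response (for example, wrapping requests.get(...).json()).
    """
    frames = []
    offset = 0
    while True:
        web_json = fetch_page(offset)
        rows = [feature['attributes'] for feature in web_json.get('features', [])]
        if not rows:
            break  # nothing came back, so there is nothing left to read
        frames.append(pd.DataFrame(rows))
        offset += len(rows)  # step by what was actually received, not a fixed 5000
        # The key appears to be omitted on the last page, so default to False.
        if not web_json.get('exceededTransferLimit', False):
            break
    return pd.concat(frames, ignore_index=True) if frames else pd.DataFrame()


# Offline stand-in for the real service: 7 rows, served at most 3 per request.
def fake_fetch_page(offset, total=7, page_size=3):
    ids = range(offset, min(offset + page_size, total))
    page = {'features': [{'attributes': {'ObjectId': i}} for i in ids]}
    if offset + page_size < total:
        page['exceededTransferLimit'] = True
    return page


all_rows_df = collect_all_rows(fake_fetch_page)
print(len(all_rows_df))
```

Collecting the pages in a list and calling pd.concat once at the end also avoids re-copying the growing DataFrame on every iteration.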

I think your request query is overly complicated:

Judging by the service's website, it looks like you can use the resultOffset parameter in the URL to page through the entire dataset:

import requests
import pandas as pd

exceeded_transfer_limit: bool = True
offset: int = 0
result_df = pd.DataFrame()

while exceeded_transfer_limit:

    base_url = f"https://services7.arcgis.com/mOBPykOjAyBO2ZKk/arcgis/rest/services/RKI_COVID19/" \
               f"FeatureServer/0/query?where=1%3D1&outFields=*&outSR=4326&resultOffset={offset}&f=json"
    web_response = requests.get(url=base_url)
    web_json = web_response.json()
    print(web_json)
    
    # print(type(web_json))
    print(web_json.keys())
    # It seems the service never returns False here; the key is simply omitted
    # once the limit is no longer exceeded, so default to False when it is missing.
    exceeded_transfer_limit = web_json.get('exceededTransferLimit', False)
    print(f"exceededTransferLimit: {exceeded_transfer_limit}")

    # DataFrame.append was removed in pandas 2.0; pd.concat is the replacement.
    result_df = pd.concat([result_df, pd.DataFrame([dict_row['attributes'] for dict_row in web_json['features']])])

    offset += 5000

    if len(result_df) >= 500000:
        result_df.to_csv(f"covid_data_{offset}.csv", sep=',', index=False)
        result_df = pd.DataFrame()

if len(result_df) > 0:
    result_df.to_csv(f"covid_data_{offset}_final.csv", index=False)
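
For completeness, here is a small offline sketch of why the code in the question ends up with a single row. The payload below is made up, shaped only roughly like the response shown in the question: the top-level JSON is one dict, so pd.json_normalize turns it into one row, while the actual records sit in the list under 'features'.

```python
import pandas as pd

# Made-up payload, shaped roughly like the response in the question (an assumption).
sample_json = {
    'objectIdFieldName': 'ObjectId',
    'fields': [{'name': 'IdBundesland', 'type': 'esriFieldTypeInteger'}],
    'exceededTransferLimit': True,
    'features': [
        {'attributes': {'IdBundesland': 5, 'Bundesland': 'Nordrhein-Westfalen'}},
        {'attributes': {'IdBundesland': 9, 'Bundesland': 'Bayern'}},
    ],
}

# Normalizing the whole response: one dict in, one row out.
one_row_df = pd.json_normalize(sample_json)
print(one_row_df.shape[0])

# Building the frame from the records themselves: one row per feature.
records_df = pd.DataFrame([feature['attributes'] for feature in sample_json['features']])
print(records_df)
```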
