Parsing a JSON file to get the right columns for inserting into BigQuery

Posted 2024-09-26 18:15:42


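I'm fairly new to Python, and I'm trying to pull some exchange rate data from the ECB free API:

GET https://api.exchangeratesapi.io/latest?base=GBP

I want the data to end up in a BigQuery table. Loading the data into BQ is fine; the problem is getting it into the right column/row format before sending it to BQ.

I'd like a table like this: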

Currency    Rate      Date
CAD         1.629..   2019-08-27
HKD         9.593..   2019-08-27
ISK         152.6..   2019-08-27
...         ...       ...

I've tried a few things but haven't quite got there.

Here is the raw JSON:

{  
   "rates":{  
      "CAD":1.6296861353,
      "HKD":9.593490542,
      "ISK":152.6759753684,
      "PHP":64.1305429339,
      "DKK":8.2428443501,
      "HUF":363.2604778172,
      "CZK":28.4888284523,
      "GBP":1.0,
      "RON":5.2195062629,
      "SEK":11.8475893558,
      "IDR":17385.9684034803,
      "INR":87.6742617713,
      "BRL":4.9997236134,
      "RUB":80.646191945,
      "HRK":8.1744110201,
      "JPY":130.2223254066,
      "THB":37.5852652759,
      "CHF":1.2042718318,
      "EUR":1.1055465269,
      "MYR":5.1255348081,
      "BGN":2.1622278974,
      "TRY":7.0550451616,
      "CNY":8.6717964026,
      "NOK":11.0104695256,
      "NZD":1.9192287707,
      "ZAR":18.6217151449,
      "USD":1.223287232,
      "MXN":24.3265563331,
      "SGD":1.6981194654,
      "AUD":1.8126540855,
      "ILS":4.3032293014,
      "KRW":1482.7479464473,
      "PLN":4.8146551248
   },
   "base":"GBP",
   "date":"2019-08-23"
}

2 Answers

Thanks to Ben P for the help.

Here is my script, for anyone interested. It uses an internal library my team uses for BQ loads, but the rest is pandas and requests:

from aa.py.gcp import GCPAuth, GCPBigQueryClient
from aa.py.log import StandardLogger
import os

import pandas as pd
import requests

# Connect to BigQuery
logger = StandardLogger('test').logger
auth = GCPAuth(logger=logger)
credentials_path = 'XXX'
credentials = auth.get_credentials(credentials_path)
gcp_bigquery = GCPBigQueryClient(logger=logger)
gcp_bigquery.connect(credentials)

# api-endpoint
URL = "https://api.exchangeratesapi.io/latest?base=GBP"

# sending get request and saving the response as response object
r = requests.get(url=URL)

# extracting data in json format
data = r.json()

# extract rates object from json
d = data['rates']

# split currency and rate for dataframe
df = pd.DataFrame.from_dict(d,orient='index')

# add date element to dataframe
df['date'] = data['date']

#column names
df.columns = ['rate', 'date']

# print dataframe
print(df)

# write dataframe to a tab-delimited csv (the index holds the currency codes)
df.to_csv('data.csv', sep='\t', encoding='utf-8')

#########################################
# write csv to BQ table
file_path = os.getcwd()
file_name = 'data.csv'
dataset_id = 'Testing'
table_id = 'Exchange_Rates'

response = gcp_bigquery.load_file_into_table(
    file_path,
    file_name,
    dataset_id,
    table_id,
    source_format='CSV',
    field_delimiter="\t",
    create_disposition='CREATE_NEVER',
    write_disposition='WRITE_TRUNCATE',
    skip_leading_rows=1,
)
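
One detail of the script above: `to_csv` writes the index (the currency codes) as the first column, but that column's header is empty. The load call skips the header row, so this does not affect the load itself. If you want the file to be self-describing, a minimal sketch, assuming a small hand-copied sample of the rates and an in-memory buffer instead of `data.csv`, is to pass `index_label`:

```python
import io

import pandas as pd

# Small sample copied from the JSON in the question (assumption: two rates only).
rates = {"CAD": 1.6296861353, "HKD": 9.593490542}

# Currency codes become the index; the single value column is named 'rate'.
df = pd.DataFrame.from_dict(rates, orient="index", columns=["rate"])
df["date"] = "2019-08-23"

# Write tab-delimited text to an in-memory buffer and name the index column.
buffer = io.StringIO()
df.to_csv(buffer, sep="\t", index_label="currency")
csv_text = buffer.getvalue()
print(csv_text)
```

The header row then names all three columns, which makes the file easier to inspect before it is loaded.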

Welcome! How about this, as one way to solve the problem.

# import the pandas library so we can use its from_dict function:
import pandas as pd

# subset the json to a dict of exchange rates and country codes:
d = data['rates']

# create a dataframe from this data, using pandas from_dict function:
df = pd.DataFrame.from_dict(d,orient='index')

# add a column for date (this value is taken from the json data):
df['date'] = data['date']

# name our columns, to keep things clean
df.columns = ['rate','date']

This gives you a dataframe with one row per currency code, with `rate` and `date` columns.

In this case the currency is the dataframe's index. If you want it as its own column, just add: df['currency'] = df.index
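
As an alternative to copying the index, here is a minimal sketch that names the index and moves it into a regular column with `rename_axis` and `reset_index`. It is a different technique from the one-liner above, and the three-rate `data` dict is a shortened copy of the JSON from the question, used only for illustration:

```python
import pandas as pd

# Shortened copy of the API response from the question (assumption: three rates).
data = {
    "rates": {"CAD": 1.6296861353, "HKD": 9.593490542, "ISK": 152.6759753684},
    "base": "GBP",
    "date": "2019-08-23",
}

# One row per currency; the currency code is the index at this point.
df = pd.DataFrame.from_dict(data["rates"], orient="index", columns=["rate"])
df["date"] = data["date"]

# Name the index 'currency', then turn it into an ordinary column.
df = df.rename_axis("currency").reset_index()
print(df)
```

The result has the columns `currency`, `rate`, and `date` in that order, which matches the table layout asked for in the question.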

You can then write this dataframe to a .csv file, or to BigQuery.

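For that, I'd suggest taking a look at the BigQuery client library. It can be a bit hard to get to grips with at first, so you may also want to look at pandas.DataFrame.to_gbq, which is simpler but less robust (the BigQuery documentation has a comparison of the client library and the pandas function).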
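
To make the two options concrete, here is a rough sketch of each. It assumes the `google-cloud-bigquery` (with `pyarrow`) and `pandas-gbq` packages are installed and that default credentials are configured. The function names and the `table_id` and `project_id` arguments are hypothetical, and the `date` column is left as a string, so the destination column type would need to match. The loaders are only defined here, not called:

```python
import pandas as pd


def build_rates_frame(data):
    """Reshape the API response into currency / rate / date rows."""
    df = pd.DataFrame.from_dict(data["rates"], orient="index", columns=["rate"])
    df["date"] = data["date"]
    return df.rename_axis("currency").reset_index()


def load_with_client_library(df, table_id):
    """Hypothetical helper: load via the BigQuery client library."""
    from google.cloud import bigquery  # imported lazily; needs google-cloud-bigquery

    client = bigquery.Client()
    # Replace the table contents on each run, like WRITE_TRUNCATE in the script above.
    job_config = bigquery.LoadJobConfig(write_disposition="WRITE_TRUNCATE")
    job = client.load_table_from_dataframe(df, table_id, job_config=job_config)
    job.result()  # wait for the load job to finish


def load_with_pandas_gbq(df, table_id, project_id):
    """Hypothetical helper: load via pandas-gbq, the simpler route."""
    import pandas_gbq  # imported lazily; needs pandas-gbq

    pandas_gbq.to_gbq(df, table_id, project_id=project_id, if_exists="replace")


sample = {"rates": {"CAD": 1.6296861353, "HKD": 9.593490542}, "date": "2019-08-23"}
frame = build_rates_frame(sample)
print(frame)
```

A call might look like `load_with_client_library(frame, "your-project.Testing.Exchange_Rates")`, where the project name is a placeholder and the dataset and table names are taken from the first answer.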
