Loading data from GCS into BigQuery using a wildcard and autodetect

Posted 2024-09-28 21:51:22

I'm new to posting on StackOverflow.

Using the google.cloud.bigquery Python SDK, I have been trying to find a way to load data from GCS into BigQuery without defining a table schema.

My LoadJobConfig has autodetect set to True, and I use a wildcard (*) in the GCS URI.

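I have confirmed that autodetect works with wildcards, but the load job fails. In the data source I'm using, one particular column is usually auto-detected as a float (e.g. 0.30), but its values sometimes carry an operator sign (e.g. <0.10), so the column needs to be a string.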
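To make the failure mode concrete, here is a small standard-library sketch. It is not BigQuery's actual autodetection logic, which the post does not describe; `infer_column_type` and the sample values are hypothetical. It only shows how a column that looks numeric in one sample stops being numeric once a value such as `<0.10` appears.

```python
def parses_as_float(value):
    """Return True if the text can be read as a float."""
    try:
        float(value)
        return True
    except ValueError:
        return False


def infer_column_type(values):
    """Naive stand-in for schema detection: FLOAT only if every value parses."""
    return "FLOAT" if all(parses_as_float(v) for v in values) else "STRING"


# Hypothetical samples of the same column taken from two different files.
clean_sample = ["0.30", "0.25", "1.10"]
mixed_sample = ["0.30", "<0.10", "1.10"]

print(infer_column_type(clean_sample))  # FLOAT
print(infer_column_type(mixed_sample))  # STRING
```

If a column type is decided from a sample like the first one, a later value like `<0.10` no longer fits the resulting FLOAT column, which matches the failure described here.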

Can anyone think of a solution that does not require defining the schema? Here is my LoadJobConfig, which I pass to the load_table_from_uri method of bigquery.client.Client.

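from google.cloud import bigquery

# Assumed setup; the original post does not show how the client is created.
bigquery_client = bigquery.Client()

# source, report_type, date and table_ref are defined elsewhere (not shown in the post).
# The trailing * matches every object under the prefix.
source_uri = 'gs://%s/%s/%s/*' % (source, report_type, date)
job_config = bigquery.LoadJobConfig()
job_config.create_disposition = 'CREATE_IF_NEEDED'
job_config.skip_leading_rows = 1  # skip the CSV header row
job_config.source_format = 'CSV'
job_config.write_disposition = 'WRITE_TRUNCATE'  # overwrite existing table data
job_config.autodetect = True  # let BigQuery infer the schema
job = bigquery_client.load_table_from_uri(source_uri, table_ref, job_config=job_config)
job.result()  # block until the load job finishes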

1 answer

#1 · Posted 2024-09-28 21:51:22

Your data seems to be corrupted in some way.

I suggest using the max_bad_records flag, which lets the load job skip up to that many bad records instead of failing.

See here for details: https://cloud.google.com/bigquery/docs/loading-data-cloud-storage-csv#bigquery-import-gcs-file-python
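A minimal sketch of that suggestion, assuming the same settings as the question's LoadJobConfig. The helper name `build_load_job_config` and the default threshold of 10 are illustrative assumptions, not values from the post. Note that rows counted as bad are dropped from the load rather than stored as strings.

```python
def build_load_job_config(max_bad_records=10):
    """Return a LoadJobConfig like the one in the question, plus max_bad_records."""
    # Imported here so the sketch only needs google-cloud-bigquery when called.
    from google.cloud import bigquery

    job_config = bigquery.LoadJobConfig()
    job_config.source_format = bigquery.SourceFormat.CSV
    job_config.skip_leading_rows = 1
    job_config.autodetect = True
    job_config.create_disposition = bigquery.CreateDisposition.CREATE_IF_NEEDED
    job_config.write_disposition = bigquery.WriteDisposition.WRITE_TRUNCATE
    # Assumed threshold: tolerate up to this many unparseable rows
    # before the load job is marked as failed.
    job_config.max_bad_records = max_bad_records
    return job_config
```

The returned object would be passed as `job_config=` to `load_table_from_uri`, exactly as in the question. Choosing the threshold is a trade-off: too low and the job still fails, too high and genuine data problems may be silently skipped.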
