<p>I think the best approach is to run a query job first:</p>
<ol>
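<li>Run a query job that writes its result to a table somewhere (a temporary destination table).</li>
<li>Run an extract job on that table in CSV format, without a header.</li>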
</ol>
<p>Here is code that does this:</p>
<pre><code>import time

from google.cloud import bigquery

# Assumes default credentials; dataset_id and bucket_name must be defined by the caller.
client = bigquery.Client()

# Step 1: write the query result to a temporary destination table.
job_config = bigquery.QueryJobConfig()
gcs_filename = 'file_with_nulls*.json.gzip'
table_ref = client.dataset(dataset_id).table('my_null_table')
job_config.destination = table_ref
job_config.write_disposition = bigquery.WriteDisposition.WRITE_TRUNCATE

# Start the query, passing in the extra configuration.
query_job = client.query(
    """#standardSql
select TO_JSON_STRING(t) AS json from `project.dataset.table` as t ;""",
    location='US',
    job_config=job_config)
while not query_job.done():
    time.sleep(1)
# Check that the table was written successfully; result() raises on failure.
query_job.result()
print("query completed")

# Step 2: extract the table as gzip-compressed CSV without a header row.
job_config = bigquery.ExtractJobConfig()
job_config.compression = bigquery.Compression.GZIP
job_config.destination_format = bigquery.DestinationFormat.CSV
job_config.print_header = False
destination_uri = 'gs://{}/{}'.format(bucket_name, gcs_filename)
extract_job = client.extract_table(
    table_ref,
    destination_uri,
    job_config=job_config,
    location='US')  # API request
extract_job.result()  # Wait for the extract job to finish.
print("extract completed")
</code></pre>
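<p>Once everything is done, you can delete the temporary table created in step 1.
If you do this quickly, the cost will be very low: storage is about $20 per TB per month, so even a full 1 TB kept for 1 hour costs roughly 20/30/24 ≈ $0.03 (3 cents), and 25 GB for 1 hour costs well under a tenth of a cent.</p>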
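<p>The cleanup step and the cost arithmetic above can be sketched as follows. This is only a rough illustration: <code>storage_cost_usd</code> and <code>drop_temp_table</code> are hypothetical helper names, the $20 per TB per month rate is the figure quoted in this answer (current pricing may differ), and a 30-day month and decimal terabytes are assumed. <code>drop_temp_table</code> expects the same <code>client</code> and <code>table_ref</code> used in the code above.</p>

```python
HOURS_PER_MONTH = 30 * 24  # 30-day month, the same approximation as in the answer


def storage_cost_usd(size_gb, hours, usd_per_tb_month=20.0):
    """Estimate the cost of keeping size_gb in storage for the given hours.

    The default rate is the figure quoted in the answer, not a verified price.
    """
    size_tb = size_gb / 1000  # assumption: decimal terabytes
    return usd_per_tb_month * size_tb * hours / HOURS_PER_MONTH


def drop_temp_table(client, table_ref):
    """Delete the intermediate table; do nothing if it is already gone."""
    # client is expected to be a google.cloud.bigquery.Client;
    # not_found_ok=True suppresses the error for a missing table.
    client.delete_table(table_ref, not_found_ok=True)


# Cost of 1 TB and of 25 GB, each kept for one hour.
print(round(storage_cost_usd(1000, 1), 4))
print(round(storage_cost_usd(25, 1), 6))
```

<p>Calling <code>drop_temp_table(client, table_ref)</code> right after <code>extract_job.result()</code> returns keeps the table's lifetime, and therefore its storage cost, as short as possible.</p>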