Dataflow BigQuery insert job fails immediately on large data

2019-04-22 (00:41:29) Executing BigQuery import job "dataflow_job_14675275193414385105". You can check its status with the bq tool: "bq show -j --project_id=X dataflow_job_14675275193414385105".
2019-04-22 (00:41:29) Workflow failed. Causes: S01:Create Dummy Element/Read+Call API+Transform JSON+Write to Bigquery /WriteToBigQuery/NativeWrite failed., A work item was attempted 4 times without success. Each time the worker eventually lost contact with the service. The work item was attempted on: beamapp-X-04212005-04211305-sf4k-harness-lqjg, beamapp-X-04212005-04211305-sf4k-harness-lgg2, beamapp-X-04212005-04211305-sf4k-harness-qn55, beamapp-X-04212005-04211305-sf4k-harness-hcsn

2 Answers

User

#1 · edited 2024-10-02 20:43:00

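As far as I know, neither Cloud Dataflow nor the Apache Beam Python SDK offers an option for diagnosing OOM (out-of-memory) errors; it may be possible with the Java SDK. I suggest opening a feature request in the Cloud Dataflow issue tracker to get more detail on this kind of problem.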

Besides checking the Dataflow job logs, I suggest monitoring your pipeline with the Stackdriver Monitoring tool, which reports resource usage per job (for example, Total memory usage time).

As for using a partition function in the Python SDK, the following code (based on the Apache Beam documentation) splits the data into 3 BigQuery load jobs:

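import apache_beam as beam

# get_percentile is a user-defined helper, not part of Beam; it is assumed
# to return a value from 0 to 100 for a single element.
def partition_fn(element, num_partitions):
    # Clamp so a percentile of exactly 100 still maps to a valid index.
    index = int(get_percentile(element) * num_partitions / 100)
    return min(index, num_partitions - 1)

# input_data is the PCollection of rows; this yields 3 PCollections.
partition = input_data | beam.Partition(partition_fn, 3)

for x in range(3):
    partition[x] | 'WritePartition %s' % x >> beam.io.WriteToBigQuery(
        table_spec,
        schema=table_schema,
        write_disposition=beam.io.BigQueryDisposition.WRITE_APPEND,
        create_disposition=beam.io.BigQueryDisposition.CREATE_IF_NEEDED)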
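
If the rows have no natural percentile to split on, a hash of the row can serve as the partition key instead. The following is only a sketch of that idea, not something from the Beam documentation: hash_partition_fn is a hypothetical name, and the sample rows are made up. It takes the (element, num_partitions) arguments that beam.Partition passes to its function and always returns an index from 0 to num_partitions - 1. The loop at the bottom simulates the split locally without Beam.

```python
import json
import zlib


def hash_partition_fn(element, num_partitions):
    """Hypothetical partition function: pick a partition from a hash of the row."""
    # Serialize deterministically so the same row always hashes the same way.
    payload = json.dumps(element, sort_keys=True, default=str).encode("utf-8")
    # zlib.crc32 is stable across processes, unlike the built-in hash() on
    # strings, which is randomized per interpreter run.
    return zlib.crc32(payload) % num_partitions


# Local simulation with made-up rows (no Beam required).
rows = [{"id": i, "name": "row-%d" % i} for i in range(10)]
buckets = [[] for _ in range(3)]
for row in rows:
    buckets[hash_partition_fn(row, 3)].append(row)

for i, bucket in enumerate(buckets):
    print("partition %d: %d rows" % (i, len(bucket)))
```

In a pipeline this function would be passed as beam.Partition(hash_partition_fn, 3) in place of partition_fn. How evenly the rows spread depends on the data, so the partition sizes are worth checking.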

User

#2 · edited 2024-10-02 20:43:00

One thing that may help with debugging is looking at the Stackdriver logs.

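If you open the Dataflow job in the Google Cloud console and click LOGS in the top-right corner of the graph panel, a logs panel should open at the bottom. In the top-right corner of that LOGS panel there is a link to Stackdriver. It gives you a large amount of log output from the workers, shuffle, and other components for this specific job.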

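There is a lot in there, and filtering out the relevant parts can be hard, but hopefully you can find something more useful than "A work item was attempted 4 times without success". For example, each worker occasionally logs how much memory it is using. You can compare that with the memory available to each worker (determined by the machine type) to see whether the workers really are running out of memory or whether the error lies elsewhere.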
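
The comparison described above is simple arithmetic, sketched below. Everything here is an assumption for illustration: the function name memory_pressure, the 0.9 threshold, and the usage figures are made up, and the used-memory value has to be read from the worker logs by hand. The memory sizes are the published values for the n1-standard machine types.

```python
# Memory per machine type in GB (n1-standard family).
MACHINE_MEMORY_GB = {
    "n1-standard-1": 3.75,
    "n1-standard-2": 7.5,
    "n1-standard-4": 15.0,
}


def memory_pressure(used_gb, machine_type, threshold=0.9):
    """Return (fraction_used, is_suspicious) for one logged memory reading.

    threshold is an arbitrary cut-off chosen for this sketch.
    """
    total_gb = MACHINE_MEMORY_GB[machine_type]
    fraction = used_gb / total_gb
    return fraction, fraction >= threshold


# Hypothetical readings copied by hand from worker log lines.
for used in (1.0, 3.6):
    fraction, suspicious = memory_pressure(used, "n1-standard-1")
    print("%.2f GB used: %.0f%% of worker memory, suspicious=%s"
          % (used, fraction * 100, suspicious))
```

A reading that stays close to the machine's total memory right before a worker loses contact points toward OOM. Readings well below it suggest the failure has another cause.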

Good luck!
