Job graph too large to submit to Google Cloud Dataflow

{ "code" : 400, "errors" : [ { "domain" : "global", "message" : "Request payload size exceeds the limit: x bytes.", "reason" : "badRequest" } ], "message" : "Request payload size exceeds the limit: x bytes.", "status" : "INVALID_ARGUMENT" }

Caused by: com.google.api.client.googleapis.json.GoogleJsonResponseException: 400 Bad Request

{ "code" : 400, "errors" : [ { "domain" : "global", "message" : "(3754670dbaa1cc6b): The job graph is too large. Please try again with a smaller job graph, or split your job into two or more smaller jobs.", "reason" : "badRequest", "debugInfo" : "detail: \"(3754670dbaa1cc6b): CreateJob fails due to Spanner error: New value exceeds the maximum size limit for this column in this database: Jobs.CloudWorkflowJob, size: 17278017, limit: 10485760.\"\n" } ], "message" : "(3754670dbaa1cc6b): The job graph is too large. Please try again with a smaller job graph, or split your job into two or more smaller jobs.", "status" : "INVALID_ARGUMENT" }

1 Answer

Anonymous user

#1 · Posted on 2024-10-01 22:37:41

There is a workaround for this problem that raises the job graph size limit to 100 MB. Pass this experiment as a pipeline option: --experiments=upload_graph

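The experiment activates a new submission path: the job file is uploaded to GCS, and the job is then created through an HTTP request that contains only a reference to the job graph, not the graph itself.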

The downside is that the UI may not be able to display the job, because it relies on the API request to share the job.

Also note that reducing the size of the job graph is still good practice.

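One important tip: anonymous DoFns and lambda functions can end up capturing a very large context in their closures. I recommend reviewing every closure in your code and making sure none of them drags in a large context.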

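Avoiding anonymous lambdas/DoFns altogether may help, since the context then stays part of the class rather than becoming part of the serialized object.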
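To illustrate the closure point, here is a minimal sketch using plain Java serialization only, with no Beam dependency. The names `Fn`, `PipelineBuilder`, `AddOffset`, and the 1 MB `lookupTable` are hypothetical stand-ins I made up for a DoFn and the class that builds your pipeline. It compares the serialized size of an anonymous class, which silently holds a reference to its enclosing instance, against a static nested class that carries only its own field.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.io.UncheckedIOException;

public class ClosureSizeDemo {

    // Hypothetical stand-in for a DoFn: an object that gets serialized into the job graph.
    interface Fn extends Serializable {
        int apply(int x);
    }

    // Hypothetical pipeline-building class that happens to hold a large field.
    static class PipelineBuilder implements Serializable {
        private final byte[] lookupTable = new byte[1_000_000]; // large context
        private final int offset;

        PipelineBuilder(int offset) {
            this.offset = offset;
        }

        // The anonymous class reads `offset` from the enclosing instance, so it
        // keeps a reference to the whole PipelineBuilder. Serializing the Fn
        // therefore serializes lookupTable as well.
        Fn anonymousFn() {
            return new Fn() {
                @Override
                public int apply(int x) {
                    return x + offset;
                }
            };
        }
    }

    // Static nested class: it has no reference to any outer instance and
    // serializes only its own int field.
    static class AddOffset implements Fn {
        private final int offset;

        AddOffset(int offset) {
            this.offset = offset;
        }

        @Override
        public int apply(int x) {
            return x + offset;
        }
    }

    // Returns the number of bytes produced by standard Java serialization.
    static int serializedSize(Serializable obj) {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
            out.writeObject(obj);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return bytes.size();
    }

    public static void main(String[] args) {
        Fn anonymous = new PipelineBuilder(1).anonymousFn();
        Fn named = new AddOffset(1);
        System.out.println("anonymous Fn bytes: " + serializedSize(anonymous));
        System.out.println("static nested Fn bytes: " + serializedSize(named));
    }
}
```

Both functions compute the same thing, but the anonymous one should come out roughly a megabyte larger because the builder travels with it. Running a check like `serializedSize` over your own DoFns before submitting is one way to find which transform is inflating the job graph.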
