spark-on-k8s resource staging server with Python

Posted 2024-10-01 07:49:41


I have been following the Running Spark on Kubernetes docs using spark-on-k8s v2.2.0-kubernetes-0.5.0, Kubernetes v1.9.0, and Minikube v0.25.0.

I can successfully run a Python job with this command:

bin/spark-submit \
  --deploy-mode cluster \
  --master k8s://https://10.128.0.4:8443 \
  --kubernetes-namespace default \
  --conf spark.executor.instances=1 \
  --conf spark.app.name=spark-pi \
  --conf spark.kubernetes.driver.docker.image=kubespark/spark-driver-py:v2.2.0-kubernetes-0.5.0 \
  --conf spark.kubernetes.executor.docker.image=kubespark/spark-executor-py:v2.2.0-kubernetes-0.5.0 \
  --jars local:///opt/spark/examples/jars/spark-examples_2.11-2.2.0-k8s-0.5.0.jar \
  local:///opt/spark/examples/src/main/python/pi.py 10

I was also able to successfully run a job through the resource staging server with this command:

[command not preserved in the source]

Is it possible to run a Python job with local dependencies? I tried this command, but it failed:

bin/spark-submit \
  --deploy-mode cluster \
  --master k8s://https://10.128.0.4:8443 \
  --kubernetes-namespace default \
  --conf spark.executor.instances=1 \
  --conf spark.app.name=spark-pi \
  --conf spark.kubernetes.driver.docker.image=kubespark/spark-driver-py:v2.2.0-kubernetes-0.5.0 \
  --conf spark.kubernetes.executor.docker.image=kubespark/spark-executor-py:v2.2.0-kubernetes-0.5.0 \
  --conf spark.kubernetes.initcontainer.docker.image=kubespark/spark-init:v2.2.0-kubernetes-0.5.0 \
  --conf spark.kubernetes.resourceStagingServer.uri=http://10.128.0.4:31000 \
  ./examples/src/main/python/pi.py 10

I found this error in the driver log:

Error: Could not find or load main class .opt.spark.jars.RoaringBitmap-0.5.11.jar

And these errors in the event log:

MountVolume.SetUp failed for volume "spark-init-properties" : configmaps "spark-pi-1518224354203-init-config" not found
...
MountVolume.SetUp failed for volume "spark-init-secret" : secrets "spark-pi-1518224354203-init-secret" not found

1 Answer
User
#1 · Posted 2024-10-01 07:49:41

The fix is to provide the examples jar as a dependency via --jars:

bin/spark-submit \
  --deploy-mode cluster \
  --master k8s://https://10.128.0.4:8443 \
  --kubernetes-namespace default \
  --conf spark.executor.instances=1 \
  --conf spark.app.name=spark-pi \
  --conf spark.kubernetes.driver.docker.image=kubespark/spark-driver-py:v2.2.0-kubernetes-0.5.0 \
  --conf spark.kubernetes.executor.docker.image=kubespark/spark-executor-py:v2.2.0-kubernetes-0.5.0 \
  --conf spark.kubernetes.initcontainer.docker.image=kubespark/spark-init:v2.2.0-kubernetes-0.5.0 \
  --conf spark.kubernetes.resourceStagingServer.uri=http://10.128.0.4:31000 \
  --jars local:///opt/spark/examples/jars/spark-examples_2.11-2.2.0-k8s-0.5.0.jar \
  ./examples/src/main/python/pi.py 10

I am not sure why this works (RoaringBitmap-0.5.11.jar should already exist in /opt/spark/jars and be added to the classpath in any case), but it solves my problem for now.
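
Since the only difference between the failing and the working command is the --jars line, it may help to keep the whole command in one place and review it before submitting. Below is a minimal sketch in POSIX sh, not anything shipped with spark-on-k8s: the helper name build_submit_cmd and the variable names are hypothetical, and the values are just the ones used in this thread. It only prints the command (a dry run) and does not contact the cluster.

```shell
#!/bin/sh
# Values copied from the commands in this thread; the variable names are
# illustrative assumptions, not spark-on-k8s settings.
SPARK_K8S_VERSION="v2.2.0-kubernetes-0.5.0"
MASTER_URL="k8s://https://10.128.0.4:8443"
STAGING_URI="http://10.128.0.4:31000"
EXAMPLES_JAR="local:///opt/spark/examples/jars/spark-examples_2.11-2.2.0-k8s-0.5.0.jar"

# build_submit_cmd is a hypothetical helper: it prints the spark-submit
# command line for a local Python app instead of running it.
# $1 = path to the Python file; remaining arguments are passed to the app.
build_submit_cmd() {
  app="$1"
  shift
  # printf reuses the '%s ' format for every argument that follows it.
  printf '%s ' \
    bin/spark-submit \
    --deploy-mode cluster \
    --master "$MASTER_URL" \
    --kubernetes-namespace default \
    --conf spark.executor.instances=1 \
    --conf spark.app.name=spark-pi \
    --conf "spark.kubernetes.driver.docker.image=kubespark/spark-driver-py:$SPARK_K8S_VERSION" \
    --conf "spark.kubernetes.executor.docker.image=kubespark/spark-executor-py:$SPARK_K8S_VERSION" \
    --conf "spark.kubernetes.initcontainer.docker.image=kubespark/spark-init:$SPARK_K8S_VERSION" \
    --conf "spark.kubernetes.resourceStagingServer.uri=$STAGING_URI" \
    --jars "$EXAMPLES_JAR" \
    "$app" "$@"
  printf '\n'
}

# Dry run: print the command for review.
build_submit_cmd ./examples/src/main/python/pi.py 10
```

The printed line is not shell-quoted, so it is meant for review; arguments containing spaces would need proper quoting before being run.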
