AWS Glue Python package: `ls` cannot access directory

Published 2024-10-02 00:43:44


I am trying to install aws-glue-libs on my local machine (Windows + Git Bash) for development purposes, following:

https://github.com/awslabs/aws-glue-libs/tree/glue-1.0

https://support.wharton.upenn.edu/help/glue-debugging

The Spark directory and the py4j archive mentioned in the errors below do exist, but the errors still occur.

The directory from which the script is run:

user@machine xxxx64~/Desktop/lm_aws_glue/aws-glue-libs-glue-1.0/bin
$ ./glue-setup.sh
ls: cannot access 'C:\Spark\spark-3.1.1-bin-hadoop2.7/python/lib/py4j-*-src.zip': No such file or directory
rm: cannot remove 'PyGlue.zip': No such file or directory
./glue-setup.sh: line 14: zip: command not found

`ls` output:

$ ls -l
total 7
-rwxr-xr-x 1 n1543781 1049089 135 May  5  2020 gluepyspark*
-rwxr-xr-x 1 n1543781 1049089 114 May  5  2020 gluepytest*
-rwxr-xr-x 1 n1543781 1049089 953 Mar  5 11:10 glue-setup.sh*
-rwxr-xr-x 1 n1543781 1049089 170 May  5  2020 gluesparksubmit*
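The `ls: cannot access` error happens because `SPARK_HOME` is set to a Windows-style path (`C:\Spark\...`), which Git Bash passes through literally, so the `py4j-*-src.zip` glob never matches. A minimal sketch of the fix, assuming a hypothetical `win_to_posix` helper that rewrites the path into the `/c/Spark/...` form Git Bash expects:

```shell
#!/usr/bin/env bash
# Hypothetical helper: convert a Windows-style path (C:\Spark\...) into the
# POSIX form (/c/Spark/...) so Git Bash globbing and ls can resolve it.
win_to_posix() {
  local p="$1"
  p="${p//\\//}"            # replace every backslash with a forward slash
  local drive="${p%%:*}"    # drive letter before the first colon
  printf '/%s%s\n' "${drive,,}" "${p#*:}"  # lowercase drive, drop the colon
}

win_to_posix 'C:\Spark\spark-3.1.1-bin-hadoop2.7'
# -> /c/Spark/spark-3.1.1-bin-hadoop2.7
```

Git Bash also ships `cygpath -u` for the same conversion; either way, the answer below simply hard-codes the POSIX form of the path.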

1 Answer

Answer #1 · Posted 2024-10-02 00:43:44

The original setup script needed only minor adjustments and then worked OK. The `zip` problem still needs to be solved.

#!/usr/bin/env bash

#original code
#ROOT_DIR="$(cd $(dirname "$0")/..; pwd)"
#cd $ROOT_DIR

#re-written
ROOT_DIR="$(cd /c/aws-glue-libs; pwd)"
cd "$ROOT_DIR"

SPARK_CONF_DIR=$ROOT_DIR/conf
GLUE_JARS_DIR=$ROOT_DIR/jarsv1

#original code
#PYTHONPATH="$SPARK_HOME/python/:$PYTHONPATH"
#PYTHONPATH=`ls $SPARK_HOME/python/lib/py4j-*-src.zip`:"$PYTHONPATH"

#re-written
PYTHONPATH="/c/Spark/spark-3.1.1-bin-hadoop2.7/python/:$PYTHONPATH"
PYTHONPATH=`ls /c/Spark/spark-3.1.1-bin-hadoop2.7/python/lib/py4j-*-src.zip`:"$PYTHONPATH"

# Generate the zip archive for glue python modules
rm -f PyGlue.zip   # -f: do not fail when the archive does not exist yet
zip -r PyGlue.zip awsglue
GLUE_PY_FILES="$ROOT_DIR/PyGlue.zip"
export PYTHONPATH="$GLUE_PY_FILES:$PYTHONPATH"

# Run mvn copy-dependencies target to get the Glue dependencies locally
#mvn -f $ROOT_DIR/pom.xml -DoutputDirectory=$ROOT_DIR/jarsv1 dependency:copy-dependencies

export SPARK_CONF_DIR=${ROOT_DIR}/conf
mkdir -p "$SPARK_CONF_DIR"                    # -p: no error if the directory exists
rm -f "$SPARK_CONF_DIR/spark-defaults.conf"   # -f: no error if the file is missing
# Generate spark-defaults.conf
echo "spark.driver.extraClassPath $GLUE_JARS_DIR/*" >> $SPARK_CONF_DIR/spark-defaults.conf
echo "spark.executor.extraClassPath $GLUE_JARS_DIR/*" >> $SPARK_CONF_DIR/spark-defaults.conf

# Restore present working directory
cd -
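For the remaining `zip: command not found` error: Git Bash typically does not bundle the `zip` tool. One possible workaround, assuming `python3` is on the `PATH`, is to build the archive with Python's standard-library `shutil.make_archive` instead:

```shell
#!/usr/bin/env bash
# Hypothetical workaround for "zip: command not found" in Git Bash:
# build PyGlue.zip with Python's stdlib instead of the external zip tool.
mkdir -p awsglue && touch awsglue/__init__.py   # stand-in module tree for the demo
python3 - <<'EOF'
import shutil
# Equivalent of: zip -r PyGlue.zip awsglue
shutil.make_archive('PyGlue', 'zip', root_dir='.', base_dir='awsglue')
EOF
ls -l PyGlue.zip
```

In `glue-setup.sh`, the `zip -r PyGlue.zip awsglue` line could be replaced by the `python3` heredoc above (without the demo `mkdir`/`touch` line), since the real `awsglue` package directory is already in the repository.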
