<p>Preparation:</p>
<p>Add the following lines to the Spark configuration file; for my local PySpark install it is <code>/usr/local/spark/conf/spark-defaults.conf</code>:</p>
<pre><code>spark.hadoop.fs.s3a.access.key=&lt;your access key&gt;
spark.hadoop.fs.s3a.secret.key=&lt;your secret key&gt;
</code></pre>
<p>Contents of the Python file:</p>
<p>Submit the job:</p>
<pre><code>spark-submit --master local \
  --packages org.apache.hadoop:hadoop-aws:2.7.3,com.amazonaws:aws-java-sdk:1.7.4,org.apache.hadoop:hadoop-common:2.7.3 \
  &lt;path to the py file above&gt;
</code></pre>