<p>When creating the Spark session, you need to add the configuration provided by Databricks so that S3 can be used as Delta storage, for example:</p>
<pre><code>from pyspark.sql import SparkSession

# Must be set before the SparkContext starts; it cannot be changed afterwards
spark = (SparkSession.builder
    .config("spark.delta.logStore.class", "org.apache.spark.sql.delta.storage.S3SingleDriverLogStore")
    .getOrCreate())
spark.sparkContext.getConf().getAll()  # verify the setting took effect
</code></pre>
<blockquote>
<p>As the name suggests, the S3SingleDriverLogStore implementation only works properly when all concurrent writes originate from a single Spark driver. This is an application property, must be set before starting SparkContext, and cannot change during the lifetime of the context.</p>
</blockquote>
<p>从Databricks
访问<a href="https://docs.delta.io/latest/delta-storage.html#configuration-for-s3" rel="nofollow noreferrer">here</a>配置s3a路径访问密钥和密钥</p>
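<p>Those s3a credential settings can be sketched alongside the LogStore property. The helper function and placeholder values below are my own illustration (not from the docs); the property names are the standard Delta and Hadoop s3a ones, and the resulting dict is assumed to be passed to <code>SparkSession.builder.config(...)</code> before the SparkContext starts:</p>

```python
# Sketch: collect the Delta-on-S3 settings so they can all be applied to
# SparkSession.builder before the SparkContext starts. The helper name
# delta_s3_conf and the placeholder key values are hypothetical.
def delta_s3_conf(access_key: str, secret_key: str) -> dict:
    return {
        "spark.delta.logStore.class":
            "org.apache.spark.sql.delta.storage.S3SingleDriverLogStore",
        "spark.hadoop.fs.s3a.access.key": access_key,
        "spark.hadoop.fs.s3a.secret.key": secret_key,
    }

conf = delta_s3_conf("<ACCESS_KEY>", "<SECRET_KEY>")
# With pyspark installed, apply it like:
#   builder = SparkSession.builder
#   for key, value in conf.items():
#       builder = builder.config(key, value)
#   spark = builder.getOrCreate()
```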