Error mounting Azure Data Lake in Apache Spark using Databricks

Posted 2024-06-01 12:00:55


I am trying to mount an Azure Data Lake using the following Python code in Apache Spark:

def check(mntPoint):
  # Collect every existing mount point, then count occurrences of mntPoint
  a = []
  for test in dbutils.fs.mounts():
    a.append(test.mountPoint)
  result = a.count(mntPoint)
  return result

mount = "/mnt/lake"

if check(mount) == 1:
  resultMsg = "<div>%s is already mounted. </div>" % mount
else:
  dbutils.fs.mount(
    source = "wasbs://root@adlsprexxxxxdlsdev.blob.core.windows.net",
    mount_point = mount,
    extra_configs = {"fs.azure.account.key.adlspretxxxxdlsdev.blob.core.windows.net": ""})
  resultMsg = "<div>%s was mounted. </div>" % mount

displayHTML(resultMsg)
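(As an aside, the existence check can be written more compactly; a minimal sketch against the same dbutils.fs.mounts() API, using a hypothetical helper name:)

# Minimal alternative: True if the path is already a mount point.
def is_mounted(mnt_point):
    return any(m.mountPoint == mnt_point for m in dbutils.fs.mounts())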

When I run this, however, I keep getting the following error:

shaded.databricks.org.apache.hadoop.fs.azure.AzureException: java.lang.IllegalArgumentException: Storage Key is not a valid base64 encoded string.

The full error is shown below:

ExecutionError                            Traceback (most recent call last)
<command-3313750897057283> in <module>
      4   resultMsg = "<div>%s is already mounted. </div>" % mount
      5 else:
----> 6   dbutils.fs.mount(
      7   source = "wasbs://root@adlsprexxxxxxxkadlsdev.blob.core.windows.net",
      8   mount_point = mount,

/local_disk0/tmp/1619799109257-0/dbutils.py in f_with_exception_handling(*args, **kwargs)
    322                     exc.__context__ = None
    323                     exc.__cause__ = None
--> 324                     raise exc
    325             return f_with_exception_handling
    326 

Can anyone tell me how to resolve this?


1 Answer

#1 · Posted 2024-06-01 12:00:55

You need to provide the storage key; right now you have an empty string there. Typically, people put the storage key into Azure Key Vault (and mount it as a secret scope) or use a Databricks-backed secret scope, and then access that storage key via dbutils.secrets.get (as shown in the documentation):

dbutils.fs.mount(
  source = "wasbs://root@adlsprexxxxxdlsdev.blob.core.windows.net",
  mount_point = mount,
  extra_configs = {"fs.azure.account.key.adlspretxxxxdlsdev.blob.core.windows.net":
      dbutils.secrets.get(scope_name, secret_name)})
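If the mount was already created with the bad (empty) key, it has to be unmounted before you try again. A minimal sketch of that cleanup plus a sanity check on the secret, assuming hypothetical scope and secret names "adls-scope" and "storage-account-key":

# Hypothetical names; replace with your own secret scope and key.
scope_name = "adls-scope"
secret_name = "storage-account-key"

# Sanity check: list the available scopes and the secrets in ours.
print(dbutils.secrets.listScopes())
print(dbutils.secrets.list(scope_name))

# Drop a stale mount created with the empty key before remounting.
if any(m.mountPoint == mount for m in dbutils.fs.mounts()):
    dbutils.fs.unmount(mount)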
