from azure.datalake.store import core, lib, multithread
import pandas as pd
tenant_id = '<your Azure AD tenant id>'
username = '<your username in AAD>'
password = '<your password>'
store_name = '<your ADL name>'
token = lib.auth(tenant_id, username, password)
# Or you can register an app to get client_id and client_secret to get token
# If you use this code in an application, I recommend authenticating as a registered client app instead
# client_id = '<client id of your app registered in Azure AD, like xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx>'
# client_secret = '<your client secret>'
# token = lib.auth(tenant_id, client_id=client_id, client_secret=client_secret)
adl = core.AzureDLFileSystem(token, store_name=store_name)
# Open the file in a with-block so it is closed once pandas has read it
with adl.open('<your csv file path, such as data/test.csv in my ADL>') as f:
    df = pd.read_csv(f)
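Note: If you authenticate with client_id & client_secret, you must grant the app the necessary access permissions, with at least the Reader role in Azure AD, as shown in the figure accompanying the original answer. For details on access security, see the official documentation. For how to register an application in Azure AD, you can refer to my answer to another SO thread, How to get an AzureRateCard with Java?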
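As a minimal sketch of the client_id / client_secret route mentioned above, the credentials can be kept out of the source file and read from environment variables. The variable names AZURE_TENANT_ID, AZURE_CLIENT_ID and AZURE_CLIENT_SECRET and the helpers service_principal_kwargs and connect are assumptions made for this illustration, not part of azure-datalake-store; only lib.auth and core.AzureDLFileSystem come from the library.

```python
import os


def service_principal_kwargs(env=os.environ):
    """Collect the keyword arguments for lib.auth from environment variables.

    The environment variable names are this sketch's own convention.
    """
    names = {
        'tenant_id': 'AZURE_TENANT_ID',
        'client_id': 'AZURE_CLIENT_ID',
        'client_secret': 'AZURE_CLIENT_SECRET',
    }
    missing = [var for var in names.values() if not env.get(var)]
    if missing:
        raise KeyError('missing environment variables: ' + ', '.join(missing))
    return {arg: env[var] for arg, var in names.items()}


def connect(store_name, env=os.environ):
    """Return an AzureDLFileSystem authenticated as the registered app."""
    # Imported lazily so the helper above can be used without the SDK installed.
    from azure.datalake.store import core, lib
    token = lib.auth(**service_principal_kwargs(env))
    return core.AzureDLFileSystem(token, store_name=store_name)
```

With the three variables set, `adl = connect('<your ADL name>')` would give the same kind of object as in the code above, provided the app has the access permissions described in the note.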
# -*- coding: utf-8 -*-
"""
Created on Wed Mar 20 11:37:19 2019
@author: Mohit Verma
"""
from azure.datalake.store import core, lib, multithread
# tenant_id, username, password and store_name must be defined beforehand
token = lib.auth(tenant_id, username, password)
adl = core.AzureDLFileSystem(token, store_name=store_name)
# typical operations
adl.ls('')
adl.ls('tmp/', detail=True)
adl.ls('tmp/', detail=True, invalidate_cache=True)
adl.cat('littlefile')
adl.head('gdelt20150827.csv')
# file-like object
with adl.open('gdelt20150827.csv', blocksize=2**20) as f:
    print(f.readline())
    print(f.readline())
    print(f.readline())
# could have passed f to any function requiring a file object:
# pandas.read_csv(f)
with adl.open('anewfile', 'wb') as f:
    # data is written on flush/close, or when buffer is bigger than
    # blocksize
    f.write(b'important data')
adl.du('anewfile')
# recursively download the whole directory tree with 5 threads and
# 16MB chunks
multithread.ADLDownloader(adl, "", 'my_temp_dir', 5, 2**24)
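The comment above notes that the open file could be passed to any function requiring a file object, such as pandas.read_csv. A minimal sketch of that idea follows. The helper read_csv_from_store and the InMemoryStore stand-in are hypothetical names introduced only so the example runs without an Azure account; with a real connection you would pass the adl object instead.

```python
import io

import pandas as pd


def read_csv_from_store(fs, path, **read_csv_kwargs):
    """Read the CSV at `path` from `fs` into a DataFrame.

    `fs` is any object whose open(path, mode) returns a binary file-like
    context manager, for example core.AzureDLFileSystem.
    """
    with fs.open(path, 'rb') as f:
        return pd.read_csv(f, **read_csv_kwargs)


class InMemoryStore:
    """Hypothetical stand-in for AzureDLFileSystem, for local testing only."""

    def __init__(self, files):
        self._files = files

    def open(self, path, mode='rb'):
        # BytesIO is a context manager, like the files returned by adl.open
        return io.BytesIO(self._files[path])


store = InMemoryStore({'data/test.csv': b'id,name\n1,alice\n2,bob\n'})
df = read_csv_from_store(store, 'data/test.csv')
print(df)
```

Extra keyword arguments are forwarded to pandas.read_csv, so options such as `usecols` or `dtype` can be supplied without changing the helper.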
I tried to write sample code that reads data from a csv file in Azure Data Lake into a pandas dataframe.
Here is my sample code.
If you have any questions, please feel free to let me know.
Below is sample code for reading a csv file from ADLS.
Please try this code and see if it helps. For other examples related to Azure Data Lake, see the following GitHub repo.
https://github.com/Azure/azure-data-lake-store-python/tree/master/azure
Also, if you want to understand the different types of authentication in ADLS, check the code base below.
https://github.com/Azure-Samples/data-lake-analytics-python-auth-options/blob/master/sample.py