Unable to read Excel files from an S3 bucket

Posted 2024-10-01 11:31:48


I have a short piece of code that works on my local system:

import pandas as pd
import glob
import openpyxl

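# Folder containing the workbooks
path = r'C:\Users\Preet\Desktop\python_files'

# List the .xlsx files in the folder
filenames = glob.glob(path + r"\*.xlsx")
print(filenames)

# Collect one combined DataFrame per workbook
frames = []

# Iterate over the Excel files
for file in filenames:
    # sheet_name=None returns a dict of every sheet; concat stacks them
    df = pd.concat(pd.read_excel(file, sheet_name=None), ignore_index=True, sort=False)
    frames.append(df)

# DataFrame.append was removed in pandas 2.0, so combine with pd.concat
finalexcelsheet = pd.concat(frames, ignore_index=True) if frames else pd.DataFrame()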
print(finalexcelsheet)
finalexcelsheet.to_excel('C:\\Users\\preet\\Desktop\\python_files\\final.xlsx',index=False)

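However, when I try to read the same xlsx files from an S3 bucket, the script only creates an empty DataFrame and stops, reporting that the job succeeded. The S3 code is below. Please tell me whether anything is missing from it.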

import boto3
import pandas as pd
import glob
import openpyxl

# path of folder

bucketname = "sit-bucket-lake-raw-static-5464"
s3 = boto3.resource('s3')
my_bucket = s3.Bucket(bucketname)
source = "sit-bucket-lake-raw-static-5464/Staging/"
target = "sit-bucket-lake-raw-static-5464/branch/2020/12/"

#Display list of files
filenames=glob.glob(source+"\*.xlsx")
print(filenames)

#initializing data frame
finalexcelsheet=pd.DataFrame()

# Iterate over the Excel files
for file in filenames:
    df = pd.concat(pd.read_excel(file,sheet_name=None), ignore_index=True,sort=False)

finalexcelsheet=finalexcelsheet.append(df,ignore_index=True)
print(finalexcelsheet)
finalexcelsheet.to_excel('target\final.xlsx',index=False)
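
One likely cause, offered as a hedged sketch rather than a confirmed diagnosis: `glob.glob` only searches the local filesystem, so a pattern built from an S3 path matches nothing, `filenames` is empty, the loop never runs, and the script finishes without error but with an empty DataFrame. The sketch below lists the objects through boto3 instead, reads each workbook from memory, and uploads the combined file. The helper names `select_xlsx_keys` and `combine_excel_from_s3`, the key prefix `Staging/` (without the bucket name), and the target key are assumptions for illustration and do not come from the original post.

```python
import glob
import io


def select_xlsx_keys(keys, prefix):
    """Return the object keys under `prefix` whose names end in .xlsx."""
    return [k for k in keys if k.startswith(prefix) and k.lower().endswith(".xlsx")]


def combine_excel_from_s3(bucket_name, source_prefix, target_key):
    """Stack every sheet of every .xlsx under source_prefix and upload the result.

    Hypothetical helper: needs AWS credentials plus boto3, pandas and openpyxl.
    """
    # Imported here so the key-selection helper stays usable without these packages.
    import boto3
    import pandas as pd

    bucket = boto3.resource("s3").Bucket(bucket_name)

    # S3 has no directories to glob; list object keys that share a prefix instead.
    keys = [obj.key for obj in bucket.objects.filter(Prefix=source_prefix)]

    frames = []
    for key in select_xlsx_keys(keys, source_prefix):
        # Download the object into memory and let pandas read it as a file-like object.
        body = bucket.Object(key).get()["Body"].read()
        sheets = pd.read_excel(io.BytesIO(body), sheet_name=None)  # dict: sheet name -> DataFrame
        frames.extend(sheets.values())

    combined = pd.concat(frames, ignore_index=True, sort=False) if frames else pd.DataFrame()

    # Serialize the workbook in memory, then upload it as a new object.
    buffer = io.BytesIO()
    combined.to_excel(buffer, index=False)
    bucket.put_object(Key=target_key, Body=buffer.getvalue())
    return combined


# glob only looks at local disk, so an S3-style path matches nothing
# (assuming no local directory with this name exists).
local_matches = glob.glob("sit-bucket-lake-raw-static-5464/Staging/*.xlsx")
```

A call might look like `combine_excel_from_s3("sit-bucket-lake-raw-static-5464", "Staging/", "branch/2020/12/final.xlsx")`. Note also that in the question's last line, `'target\final.xlsx'` is a plain string rather than the `target` variable, and `\f` inside it is a form-feed escape, so that call would try to write a local file instead of an object in the bucket.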
