How do I open an .xlsx file exported from Oracle DB?

Posted 2024-10-01 05:01:16


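A teammate pulls several reports from Oracle every day and dumps each one into its own single-sheet .xlsx file, so that he can open the reports in Excel and do some cleanup. I want to automate the whole task with Pandas, but so far I have not been able to open the downloaded files with any Python library.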

When I try to open a file with Pandas, XLRD throws the following error:

XLRDError                                 Traceback (most recent call last)
<ipython-input-19-0414e67ce665> in <module>
----> 1 df = pd.read_excel("small_data_samples/ruben/Actividades-Conectar Arreglos Pymes_30_07_19.xlsx")

~/.local/share/virtualenvs/datas--Z8piCS3/lib/python3.6/site-packages/xlrd/__init__.py in open_workbook(filename, logfile, verbosity, use_mmap, file_contents, encoding_override, formatting_info, on_demand, ragged_rows)
    143         if 'content.xml' in component_names:
    144             raise XLRDError('Openoffice.org ODS file; not supported')
--> 145         raise XLRDError('ZIP file contents not a known type of workbook')
    146 
    147     from . import book

XLRDError: ZIP file contents not a known type of workbook

I also tried Openpyxl, with no better luck:

---------------------------------------------------------------------------
IndexError                                Traceback (most recent call last)
<ipython-input-18-a458f5d28de0> in <module>
----> 1 book = openpyxl.load_workbook("small_data_samples/ruben/Actividades-Conectar Arreglos Pymes_30_07_19.xlsx")

~/.local/share/virtualenvs/datas--Z8piCS3/lib/python3.6/site-packages/openpyxl/reader/excel.py in load_workbook(filename, read_only, keep_vba, data_only, guess_types, keep_links)
    221             ws._rels = rels
    222             ws_parser = WorkSheetParser(ws, fh, shared_strings)
--> 223             ws_parser.parse()
    224 
    225             if rels:

~/.local/share/virtualenvs/datas--Z8piCS3/lib/python3.6/site-packages/openpyxl/reader/worksheet.py in parse(self)
    128             tag_name = element.tag
    129             if tag_name in dispatcher:
--> 130                 dispatcher[tag_name](element)
    131                 element.clear()
    132             elif tag_name in properties:

~/.local/share/virtualenvs/datas--Z8piCS3/lib/python3.6/site-packages/openpyxl/reader/worksheet.py in parse_row(self, row)
    290 
    291         for cell in safe_iterator(row, self.CELL_TAG):
--> 292             self.parse_cell(cell)
    293 
    294 

~/.local/share/virtualenvs/datas--Z8piCS3/lib/python3.6/site-packages/openpyxl/reader/worksheet.py in parse_cell(self, element)
    209         if style_id is not None:
    210             style_id = int(style_id)
--> 211             style_array = self.styles[style_id]
    212 
    213         if coordinate:

IndexError: list index out of range

I also tried opening the file with the zipfile library to extract the .xml content I need, and this is what I found inside:

[Content_Types].xml
_rels/
_rels/.rels
_rels/workbook.xml.rels
sheet1.xml
styles.xml
workbook.xml

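I was able to locate the content I am looking for, but parsing it by hand is heavy and complicated, and I would like to avoid it unless there is no better way.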
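
For reference, the manual route could look roughly like the sketch below. It reads sheet1.xml (the part name shown in the listing above) straight out of the archive and ignores cell styles entirely, which sidesteps the style index that Openpyxl fails on. It assumes the worksheet uses the standard SpreadsheetML namespace and that text cells are inline strings or literal values, since the archive lists no sharedStrings.xml part; neither assumption has been checked against the real export. The helper names are illustrative only.

```python
import zipfile
import xml.etree.ElementTree as ET

# Standard SpreadsheetML namespace; assumed, not confirmed for this export.
NS = {"m": "http://schemas.openxmlformats.org/spreadsheetml/2006/main"}


def column_index(cell_ref):
    """Convert a cell reference such as 'C7' to a zero-based column index (2)."""
    letters = "".join(ch for ch in cell_ref if ch.isalpha())
    index = 0
    for ch in letters.upper():
        index = index * 26 + (ord(ch) - ord("A") + 1)
    return index - 1


def cell_text(cell):
    """Return the text of a <c> element, ignoring its style attribute."""
    if cell.get("t") == "inlineStr":
        # Inline strings live in <is><t>...</t></is>; join rich-text runs too.
        return "".join(t.text or "" for t in cell.iterfind(".//m:t", NS))
    value = cell.find("m:v", NS)
    return value.text if value is not None else None


def read_sheet_rows(xlsx_file, sheet_name="sheet1.xml"):
    """Read raw cell values from one worksheet part as a list of row lists."""
    with zipfile.ZipFile(xlsx_file) as archive:
        root = ET.fromstring(archive.read(sheet_name))
    rows = []
    for row in root.iterfind(".//m:row", NS):
        values = []
        for cell in row.iterfind("m:c", NS):
            ref = cell.get("r")
            position = column_index(ref) if ref else len(values)
            # Pad skipped (empty) cells so columns stay aligned.
            values.extend([None] * (position - len(values)))
            values.append(cell_text(cell))
        rows.append(values)
    return rows
```

Every value comes back as a string (or None), so numbers and dates would still need converting. If the first row holds the headers, the result could be handed to Pandas with `pd.DataFrame(rows[1:], columns=rows[0])`.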

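So far I have not been able to open this file with Python, but I can open it with Excel and with LibreOffice, on both Windows and Linux. If I do that and save the file again, I can then open it directly in Pandas with XLRD, and with Openpyxl too.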
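
Since re-saving fixes the file, one way to script that step might be LibreOffice's headless converter. The sketch below is only that: it assumes the `soffice` executable is on the PATH, and it assumes that a headless conversion rewrites the workbook the same way saving from the GUI does, which has not been verified for these exports. The function names and the separate output directory are illustrative choices.

```python
import shutil
import subprocess
from pathlib import Path


def build_resave_command(source, out_dir, soffice="soffice"):
    """Build the LibreOffice command that re-saves `source` as .xlsx in `out_dir`."""
    return [
        soffice,
        "--headless",
        "--convert-to", "xlsx",
        "--outdir", str(out_dir),
        str(source),
    ]


def resave_with_libreoffice(source, out_dir, soffice="soffice"):
    """Re-save one workbook and return the expected path of the converted copy."""
    if shutil.which(soffice) is None:
        raise FileNotFoundError(f"{soffice!r} not found on PATH")
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    # check=True raises CalledProcessError if LibreOffice exits with an error.
    subprocess.run(build_resave_command(source, out_dir, soffice), check=True)
    # LibreOffice keeps the original base name and writes it into out_dir.
    return out_dir / (Path(source).stem + ".xlsx")
```

Writing to a different directory keeps the original download untouched. The returned path could then be passed to `pd.read_excel` as usual.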

