UnicodeDecodeError: 'charmap' codec can't decode byte 0x8d in position 386: character maps to <undefined>

Posted 2024-10-04 11:30:09


I am trying to read a PDF file with the slate library, but the following code raises an error:

import slate

pdf = 'tabla9.pdf'

with open(pdf, encoding="utf-8") as f:
    doc = slate.PDF(f)

for page in doc[:2]:
    print(page)

Full error, traced through the source:

slate, `PDF` class:

class PDF(list):
    def __init__(self, file, password='', just_text=1, check_extractable=True, char_margin=1.0, line_margin=0.1, word_margin=0.1):
        self.parser = PDFParser(file)

pdfparser.py, line 646:

    def __init__(self, fp):
        PSStackParser.__init__(self, fp)

psparser.py, line 189:

class PSStackParser(PSBaseParser):

    def __init__(self, fp):
        PSBaseParser.__init__(self, fp)

psparser.py, line 134:

class PSBaseParser:

    """Most basic PostScript parser that performs only tokenization.
    """
    def __init__(self, fp):
        data = fp.read()

File "C:\Python3\lib\codecs.py", line 322, in decode
    (result, consumed) = self._buffer_decode(data, self.errors, final)
UnicodeDecodeError: 'utf-8' codec can't decode byte 0xe2 in position 10: invalid continuation byte

def decode(self, input, final=False):
    # decode input (taking the buffer into account)
    data = self.buffer + input
    (result, consumed) = self._buffer_decode(data, self.errors, final)

I am using Python 3.7 on Windows 10.


1 Answer

User

#1 · Posted 2024-10-04 11:30:09

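PDF files are binary, so they should not be opened in text mode with an encoding. In text mode Python tries to decode every byte as characters, and that fails on the raw bytes a PDF contains.

Try: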

with open(pdf, "rb") as f:
    doc = slate.PDF(f)
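
To see why the mode matters without installing slate, here is a minimal sketch. It assumes a small hypothetical stand-in file (`SAMPLE`, not the asker's `tabla9.pdf`) containing a PDF-style header plus a few non-text bytes. The helper names `read_text_mode`, `read_binary_mode` and `demo` are made up for this illustration. Reading the file in text mode fails for both UTF-8 and cp1252 (a common default encoding on Western-locale Windows, which is presumably where the 'charmap' message in the title comes from), while binary mode returns the bytes untouched.

```python
import os
import tempfile

# Hypothetical stand-in for a real PDF: a header line, then a comment line of
# high bytes like the one many PDFs carry, plus 0x8d, which cp1252 leaves undefined.
SAMPLE = b"%PDF-1.4\n%\xe2\xe3\xcf\xd3\x8d\n"


def read_text_mode(path, encoding):
    """Read the file as text; return the decode error message, or None on success."""
    try:
        with open(path, encoding=encoding) as f:
            f.read()
        return None
    except UnicodeDecodeError as exc:
        return str(exc)


def read_binary_mode(path):
    """Read the file as raw bytes; no decoding is attempted."""
    with open(path, "rb") as f:
        return f.read()


def demo():
    """Write SAMPLE to a temporary file and read it back three ways."""
    fd, path = tempfile.mkstemp(suffix=".pdf")
    try:
        with os.fdopen(fd, "wb") as f:
            f.write(SAMPLE)
        return (
            read_text_mode(path, "utf-8"),
            read_text_mode(path, "cp1252"),
            read_binary_mode(path),
        )
    finally:
        os.remove(path)


if __name__ == "__main__":
    utf8_error, cp1252_error, raw = demo()
    print("utf-8 :", utf8_error)
    print("cp1252:", cp1252_error)
    print("binary:", raw)
```

The two text-mode calls report a 'utf-8' and a 'charmap' decode error respectively. The binary read succeeds because no codec is involved, which is the form of input a PDF parser needs.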
