tabula-py returns "..." for one particular column in the df. Everything else seems to work

Posted 2024-09-30 22:20:54


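Expected behavior:

Read the PDF and extract all of the table data from it.

Actual behavior:

It reads the PDF fine, extracts most of the table data, and saves it to a debug.txt file with fp.write(df). When I look at debug.txt or watch the terminal print it, one column (the name) usually just comes back as "...".

It returns "..." something like 9 times out of 10. Sometimes it is only the first page and the rest are fine. Sometimes they are all fine, which seems odd.

(I may be an idiot, and it may be shortening the column because it is by far the longest string, 2-3x the length of the others. But my Google-fu has failed me.)

Sample input (names withheld for privacy):

Sample output: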

21        121         87    59 2003  ...         NaN        NaN         NaN
22        122         86    59 2026  ...         NaN        NaN         NaN
23        123         85    60 2038  ...         NaN        NaN         NaN
24        124         84    60 2050  ...         NaN        NaN         NaN
25        125         83    61 2056  ...         NaN        NaN         NaN
26        126         82    61 2095  ...         NaN        NaN         NaN

Code:

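import os

import pandas as pd
from tabula import read_pdf

# SPLITDIR, SPLITPATH and debugextract are assumed to be defined earlier in the script.
pagecount = 0
for filename in os.listdir(SPLITDIR):

    print("Working on: {}".format(filename))

    # Skip anything that is not a PDF.
    if not filename.endswith(".pdf"):
        print("I dont think {} is a PDF".format(filename))
        continue

    # Extract the tables from every page of the current split file.
    pagedf = read_pdf(SPLITPATH.format(pagecount), pages='all')
    #print(pagedf)
    # Write the extracted table to the debug file as text.
    debugextract.write(str(pagedf))

    pagedf = pd.DataFrame(pagedf)
    print(pagedf)

    pagecount += 1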


1 Answer

#1 · Posted 2024-09-30 22:20:54

This is not coming from tabula; it is a display setting in IPython or Jupyter.

See also https://github.com/chezou/tabula-py/issues/216#issuecomment-581837621
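To illustrate the point, here is a minimal sketch using pandas only. The DataFrame below is a made-up stand-in for an extracted table (the column names and values are hypothetical, not the asker's data). It assumes the "..." comes from pandas shortening the printed form of a wide DataFrame, which is what `str(df)` and `print(df)` produce, and shows two ways to get the full text: `DataFrame.to_string()` for writing to a file, and `pd.option_context` for printing.

```python
import pandas as pd

# Hypothetical stand-in for a table extracted from a PDF: many columns,
# with a longer "name" column sitting in the middle.
data = {f"col{i}": [i, i + 1, i + 2] for i in range(12)}
data["name"] = ["Placeholder Name One", "Placeholder Name Two", "Placeholder Name Three"]
data.update({f"extra{i}": [float("nan")] * 3 for i in range(12)})
df = pd.DataFrame(data)

# With a small display.max_columns, the printed form hides the middle
# columns behind "...". The data itself is untouched.
with pd.option_context("display.max_columns", 5):
    truncated = str(df)

# to_string() renders every column, which suits a debug file.
full_text = df.to_string()

# Alternatively, lift the display limits and keep using str()/print().
with pd.option_context("display.max_columns", None, "display.width", None):
    full_repr = str(df)

print(truncated)
print(full_text)
```

In the question's loop, this would mean writing `pagedf.to_string()` instead of `str(pagedf)` to the debug file, assuming `pagedf` is a single DataFrame at that point. Calling `pd.set_option("display.max_columns", None)` once at the top of the script is another option if the change should apply to every print.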
