Is it possible to convert a PDF file to its grayscale equivalent with a Python library? I tried the ghostscript module:
import locale
from io import BytesIO
import ghostscript as gs

ENCO = locale.getpreferredencoding()
STDOUT = BytesIO()
STDERR = BytesIO()

with open('adob_in.pdf', 'r') as infile:
    ARGS = f"""DUMMY -sOutputFile=adob_out.pdf -sDEVICE=pdfwrite
    -sColorConversionStrategy=Gray -dProcessColorModel=/DeviceGray
    -dNOPAUSE -dBATCH {infile.name}"""
    ARGSB = [arg.encode(ENCO) for arg in ARGS.split()]
    gs.Ghostscript(*ARGSB, stdout=STDOUT, stderr=STDERR)

print(STDOUT.getvalue().decode(ENCO))
print(STDERR.getvalue().decode(ENCO))
The standard output and error streams are:
GPL Ghostscript 9.52 (2020-03-19)
Copyright (C) 2020 Artifex Software, Inc. All rights reserved.
This software is supplied under the GNU AGPLv3 and comes with NO WARRANTY:
see the file COPYING for details.
Processing pages 1 through 1.
Page 1
Unfortunately, the grayscale PDF is corrupted. In fact, debugging it with Ghostscript shows the following errors:
GPL Ghostscript 9.52 (2020-03-19)
Copyright (C) 2020 Artifex Software, Inc. All rights reserved.
This software is supplied under the GNU AGPLv3 and comes with NO WARRANTY:
see the file COPYING for details.
**** Error: Cannot find a 'startxref' anywhere in the file.
Output may be incorrect.
**** Error: An error occurred while reading an XREF table.
**** The file has been damaged. This may have been caused
**** by a problem while converting or transfering the file.
**** Ghostscript will attempt to recover the data.
**** However, the output may be incorrect.
**** Error: Trailer dictionary not found.
Output may be incorrect.
No pages will be processed (FirstPage > LastPage).
**** This file had errors that were repaired or ignored.
**** Please notify the author of the software that produced this
**** file that it does not conform to Adobe's published PDF
**** specification.
**** The rendered output from this file may be incorrect.
GS>
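Note that the string ARGS contains a valid Ghostscript command (tested on the Linux command line with GPL Ghostscript 9.52), and ARGSB is just the corresponding binary representation of that string: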
print(ARGSB)
[b'DUMMY', b'-sOutputFile=adob_out.pdf', b'-sDEVICE=pdfwrite', b'-sColorConversionStrategy=Gray', b'-dProcessColorModel=/DeviceGray', b'-dNOPAUSE', b'-dBATCH', b'adob_in.pdf']
How can this task be done properly? My sample input and output files can be found here. Thanks a lot in advance.
I don't know how to do this with Ghostscript, but code that uses pdf2image and img2pdf gets the job done.
The PDF with the grayscale images is saved as Gray_PDF.pdf in the same directory.
Explanation: the first step converts the pages of the PDF into a list of images, images.
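A minimal sketch of this first step, assuming pdf2image (together with the Poppler tools it wraps) and Pillow are installed. The function name `pdf_to_gray_images` and the `convert` parameter are hypothetical: `convert` exists only so the pdf2image call can be swapped out. The grayscale conversion here uses Pillow's `Image.convert("L")`, which is not necessarily how the answer's own code produced gray pages.

```python
import tempfile


def pdf_to_gray_images(pdf_path, dpi=200, convert=None):
    """Return one grayscale image per page of the PDF at pdf_path."""
    if convert is None:
        # Assumption: pdf2image is installed; imported lazily on first use.
        from pdf2image import convert_from_path as convert
    with tempfile.TemporaryDirectory() as tmp_dir:
        # pdf2image writes the rendered pages into tmp_dir, so they can only
        # be read while this "with" block is still open.
        pages = convert(pdf_path, dpi=dpi, output_folder=tmp_dir)
        # convert("L") returns a new 8-bit grayscale image held in memory,
        # so the result stays usable after tmp_dir has been deleted.
        return [page.convert("L") for page in pages]
```

Under these assumptions, `images = pdf_to_gray_images('adob_in.pdf')` would give one grayscale image per page of the question's input file.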
The next step saves the images again in the same directory as page_1.jpeg, page_2.jpeg, and so on. It also builds a list of the paths to these new images.
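One way to sketch this saving step is shown below. The helper name `save_pages_as_jpeg` and its `out_dir` parameter are hypothetical; the only assumption about each image is that it offers Pillow's `save(path, format)` method.

```python
import os


def save_pages_as_jpeg(images, out_dir="."):
    """Save each image as page_<n>.jpeg in out_dir and return the new paths."""
    paths = []
    for number, image in enumerate(images, start=1):
        path = os.path.join(out_dir, f"page_{number}.jpeg")
        # Pillow's Image.save takes the target path and an explicit format.
        image.save(path, "JPEG")
        paths.append(path)
    return paths
```

Numbering starts at 1 to match the page_1.jpeg, page_2.jpeg naming described above.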
Finally, the last step creates a PDF named Gray_PDF from the grayscale images created earlier and saves it in the working directory.
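A hedged sketch of this final step, assuming img2pdf is installed. `img2pdf.convert` accepts a list of image file names and returns the PDF as bytes; the wrapper name `images_to_pdf` and its `convert` parameter are hypothetical, the latter again being only a hook for swapping out the library call.

```python
def images_to_pdf(image_paths, pdf_path="Gray_PDF.pdf", convert=None):
    """Combine the image files into a single PDF and return its path."""
    if convert is None:
        # Assumption: img2pdf is installed; imported lazily on first use.
        from img2pdf import convert
    # Build the PDF bytes first so a failed conversion leaves no empty file.
    pdf_bytes = convert(image_paths)
    with open(pdf_path, "wb") as out_file:
        out_file.write(pdf_bytes)
    return pdf_path
```

The pages appear in the PDF in the order of `image_paths`, so the list of saved JPEG paths should be passed in page order.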
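Additional tip: if you want to do more image processing with OpenCV, this approach gives you a lot of flexibility, because every page is now available as an image. Just make sure that all such operations happen inside the first with statement.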