PDF bleed detection

Posted 2024-10-03 13:17:58

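I am currently writing a small tool (Python + pyPdf) to check PDFs for printer conformity.

Alas, I am already stuck on the first task: detecting whether a PDF has at least 3 mm of "bleed" (a border around the page where nothing is printed). I have already figured out that I cannot detect the bleed for the document as a whole, since there does not seem to be a global one. On each page, however, I can find a total of five different boxes: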

  • mediaBox
  • bleedBox
  • trimBox
  • cropBox
  • artBox

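I read the pyPdf documentation on these boxes, but the only one I understood is the mediaBox, which seems to represent the size of the whole page (i.e. the paper).

Obviously, the bleedBox is supposed to define the bleed, but that does not always seem to be the case.

Another thing I noticed: in one PDF, for example, all of these boxes have exactly the same size on every page (which would mean no bleed at all), yet when I open it there is a large amount of bleed. This makes me think that the individual text elements carry their own offset.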

So, obviously, just calculating the bleed from the mediaBox and the bleedBox is not the way to go.

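I would be very glad if somebody could explain what these boxes actually are and what I can conclude from them (for example, is one box always smaller than another?).

Bonus question: can somebody tell me what exactly the "default user space unit" mentioned in the documentation is? I am pretty sure it refers to mm on my machine, but I would like to enforce mm everywhere.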


1 answer

Answer 1 · posted 2024-10-03 13:17:58

Quoting the PDF specification ISO 32000-1:2008, as published by Adobe:

14.11.2 Page Boundaries

14.11.2.1 General

A PDF page may be prepared either for a finished medium, such as a sheet of paper, or as part of a prepress process in which the content of the page is placed on an intermediate medium, such as film or an imposed reproduction plate. In the latter case, it is important to distinguish between the intermediate page and the finished page. The intermediate page may often include additional production-related content, such as bleeds or printer marks, that falls outside the boundaries of the finished page. To handle such cases, a PDF page may define as many as five separate boundaries to control various aspects of the imaging process:

  • The media box defines the boundaries of the physical medium on which the page is to be printed. It may include any extended area surrounding the finished page for bleed, printing marks, or other such purposes. It may also include areas close to the edges of the medium that cannot be marked because of physical limitations of the output device. Content falling outside this boundary may safely be discarded without affecting the meaning of the PDF file.

  • The crop box defines the region to which the contents of the page shall be clipped (cropped) when displayed or printed. Unlike the other boxes, the crop box has no defined meaning in terms of physical page geometry or intended use; it merely imposes clipping on the page contents. However, in the absence of additional information (such as imposition instructions specified in a JDF or PJTF job ticket), the crop box determines how the page’s contents shall be positioned on the output medium. The default value is the page’s media box.

  • The bleed box (PDF 1.3) defines the region to which the contents of the page shall be clipped when output in a production environment. This may include any extra bleed area needed to accommodate the physical limitations of cutting, folding, and trimming equipment. The actual printed page may include printing marks that fall outside the bleed box. The default value is the page’s crop box.

  • The trim box (PDF 1.3) defines the intended dimensions of the finished page after trimming. It may be smaller than the media box to allow for production-related content, such as printing instructions, cut marks, or colour bars. The default value is the page’s crop box.

  • The art box (PDF 1.3) defines the extent of the page’s meaningful content (including potential white space) as intended by the page’s creator. The default value is the page’s crop box.

The page object dictionary specifies these boundaries in the MediaBox, CropBox, BleedBox, TrimBox, and ArtBox entries, respectively (see Table 30). All of them are rectangles expressed in default user space units. The crop, bleed, trim, and art boxes shall not ordinarily extend beyond the boundaries of the media box. If they do, they are effectively reduced to their intersection with the media box. Figure 86 illustrates the relationships among these boundaries. (The crop box is not shown in the figure because it has no defined relationship with any of the other boundaries.)
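The defaulting and clipping rules quoted above can be sketched in a few lines of plain Python. This is only an illustration under assumptions: rectangles are taken to be normalized `(llx, lly, urx, ury)` tuples in default user space units, and the names `Rect`, `intersect` and `effective_boxes` are made up for this example; they are not part of pyPdf.

```python
from typing import Dict, Optional, Tuple

# Assumption: a rectangle is (llx, lly, urx, ury) with llx <= urx and lly <= ury.
Rect = Tuple[float, float, float, float]


def intersect(box: Rect, media: Rect) -> Rect:
    """Reduce a box to its intersection with the media box."""
    return (
        max(box[0], media[0]),
        max(box[1], media[1]),
        min(box[2], media[2]),
        min(box[3], media[3]),
    )


def effective_boxes(
    media: Rect,
    crop: Optional[Rect] = None,
    bleed: Optional[Rect] = None,
    trim: Optional[Rect] = None,
    art: Optional[Rect] = None,
) -> Dict[str, Rect]:
    """Apply the defaults from the quoted section.

    A missing crop box defaults to the media box; missing bleed, trim
    and art boxes default to the (effective) crop box. Boxes that are
    present are clipped to the media box.
    """
    crop_eff = intersect(crop, media) if crop is not None else media

    def resolve(box: Optional[Rect]) -> Rect:
        return intersect(box, media) if box is not None else crop_eff

    return {
        "media": media,
        "crop": crop_eff,
        "bleed": resolve(bleed),
        "trim": resolve(trim),
        "art": resolve(art),
    }


if __name__ == "__main__":
    # Hypothetical page: only a media box and a trim box are defined.
    for name, rect in effective_boxes(
        (0.0, 0.0, 612.0, 792.0), trim=(9.0, 9.0, 603.0, 783.0)
    ).items():
        print(name, rect)
```

Note that when a page defines only its media box, every box resolves to that same rectangle, which would match the observation in the question that all five boxes have exactly the same size.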

Below is a nice graphic showing the relationships among these boxes:

[Figure: relationships among the page boundary boxes]

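In many cases only the media box is set, because

  1. for PDFs intended for electronic consumption (i.e. reading on a computer), the other boxes hardly matter; and

  2. even in a prepress context they are no longer as necessary as they used to be; see the article Pedro mentioned in his comment.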

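Regarding your "bonus question": by default, the user space unit is 1/72 inch. Since PDF 1.6, however, it can be changed to any (not necessarily integer) multiple of that size using the UserUnit entry in the page dictionary. Changing it in an existing PDF essentially scales the page, because the user space unit is the basic unit of the page's device-independent coordinate system. So unless you want to update every command in the page description that refers to coordinates in order to preserve the page dimensions, you will not want to force a millimetre user space unit... ;)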
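Rather than changing the unit inside the PDF, the box values can be converted to millimetres on the reading side. The following is a minimal sketch, assuming the trim and bleed boxes are available as normalized `(llx, lly, urx, ury)` tuples and that `user_unit` holds the page's UserUnit value (1.0 when absent). The helper names `to_mm`, `bleed_margins_mm` and `has_min_bleed` are hypothetical and not part of pyPdf.

```python
from typing import Tuple

Rect = Tuple[float, float, float, float]  # (llx, lly, urx, ury), assumed normalized

MM_PER_INCH = 25.4
DEFAULT_UNITS_PER_INCH = 72.0  # default user space unit is 1/72 inch


def to_mm(value: float, user_unit: float = 1.0) -> float:
    """Convert a length in user space units to millimetres."""
    return value * user_unit / DEFAULT_UNITS_PER_INCH * MM_PER_INCH


def bleed_margins_mm(
    trim: Rect, bleed: Rect, user_unit: float = 1.0
) -> Tuple[float, float, float, float]:
    """Distance from each trim edge to the bleed edge: (left, bottom, right, top)."""
    margins = (
        trim[0] - bleed[0],
        trim[1] - bleed[1],
        bleed[2] - trim[2],
        bleed[3] - trim[3],
    )
    return tuple(to_mm(m, user_unit) for m in margins)


def has_min_bleed(
    trim: Rect, bleed: Rect, min_mm: float = 3.0, user_unit: float = 1.0
) -> bool:
    """True if the declared bleed is at least min_mm on all four sides."""
    tolerance = 1e-6  # absorb floating-point rounding
    return all(
        m >= min_mm - tolerance
        for m in bleed_margins_mm(trim, bleed, user_unit)
    )


if __name__ == "__main__":
    bleed_box = (0.0, 0.0, 612.0, 792.0)
    trim_box = (9.0, 9.0, 603.0, 783.0)  # 9 units inside the bleed box
    print(bleed_margins_mm(trim_box, bleed_box))
    print(has_min_bleed(trim_box, bleed_box))
```

This only checks the bleed that the boxes declare. It cannot tell whether the page content actually extends into that area, so a file whose boxes are all identical will report zero bleed regardless of how it looks on screen.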
