Tagged PDF with PDFBox in Java

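Is it possible to create a tagged PDF (PDF/UA) with PDFBox? It looks like PDFBox has an API for it (the package org.apache.pdfbox.pdmodel.documentinterchange.taggedpdf), but I cannot find any tutorials or code samples.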

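With the code below I generate a PDF file that contains an image. The screen reader (NVDA in my case) recognizes it and reads "... graphic Alternate Description". However, the accessibility checker PAC 2 reports an error: "Image object not tagged".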

        PDDocument doc = new PDDocument();
        PDPage page = new PDPage();
        doc.addPage(page);
        PDDocumentCatalog documentCatalog = doc.getDocumentCatalog();

        PDImageXObject pdImage = PDImageXObject.createFromFile(imagePath, doc);
        PDPageContentStream contents = new PDPageContentStream(doc, page);
        contents.drawImage(pdImage, 100, 600, pdImage.getWidth() / 2, pdImage.getHeight() / 2);
        contents.close();

        PDStructureTreeRoot treeRoot = new PDStructureTreeRoot();
        PDStructureElement structureElement = new PDStructureElement(StandardStructureTypes.Figure, treeRoot);
        structureElement.setPage(page);

        PDMarkedContent markedImg = new PDMarkedContent(COSName.IMAGE, new COSDictionary());
        markedImg.addXObject(pdImage);

        structureElement.appendKid(markedImg);
        structureElement.setAlternateDescription("Alternate Description");
        treeRoot.appendKid(structureElement);
        documentCatalog.setStructureTreeRoot(treeRoot);
        // ....
        doc.save(fileName);

Can you provide some explanation or code samples on this topic?


1 answer

  1. # Answer 1

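    I put together a working example that demonstrates how to create an accessible PDF with PDFBox 2: https://github.com/martinlovell/accessible-pdfbox-example

    The code in the question is missing a few things. The marked content needs alt text, and I believe you also need an MCID for the marked content.

    The example project demonstrates what you need in more detail.

    It should look something like this: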

    PDPageContentStream contents = new PDPageContentStream(doc, page);
    
    // the content in the stream needs an id
    int mcid = 5;
    COSDictionary dictionary = new COSDictionary();
    dictionary.setInt(COSName.MCID, mcid);
    
    // wrap image drawing in marked content
    contents.beginMarkedContent(COSName.IMAGE, PDPropertyList.create(dictionary));
    contents.drawImage(pdImage, 100, 600, pdImage.getWidth() / 2, pdImage.getHeight() / 2);
    contents.endMarkedContent();
    
    contents.close();
    
    PDStructureTreeRoot treeRoot = new PDStructureTreeRoot();
    documentCatalog.setStructureTreeRoot(treeRoot);
    PDStructureElement structureElement = new PDStructureElement(StandardStructureTypes.Figure, treeRoot);
    structureElement.setPage(page);
    structureElement.setAlternateDescription("Alternate Description");
    
    // Set alt text on marked content for structure.  
    // This is the dictionary with the mcid used in beginMarkedContent.
    dictionary.setString(COSName.ALT, "Alternate Description");
    PDMarkedContent markedImg = new PDMarkedContent(COSName.IMAGE, dictionary);
    markedImg.addXObject(pdImage);
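    structureElement.appendKid(markedImg);
    // attach the figure element to the structure tree root, as the question's code does
    treeRoot.appendKid(structureElement);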