Editing a PDF file in an AWS S3 bucket using iText

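An AWS S3 bucket contains a PDF file. Its content needs to be edited with the iText Java library, and the modified file then needs to be stored back in the S3 bucket. We are currently using an AWS Lambda function. An empty PDF file is created in the target S3 bucket, and AWS CloudWatch shows the error message "Pipe closed".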

Lambda Java code:

private String bucketName = "forms-storage";

public String getProposalPdf(InputRequest inputRequest, Context context) throws DocumentException, IOException{

    final BasicAWSCredentials awsCreds = new BasicAWSCredentials(ConstantValues.AccessKey, ConstantValues.SecretKey);
    final AmazonS3Client s3client = (AmazonS3Client) AmazonS3ClientBuilder.standard().withRegion(Regions.AP_SOUTH_1)
                    .withCredentials(new AWSStaticCredentialsProvider(awsCreds)).build();
    S3Object object = s3client.getObject(new GetObjectRequest(bucketName, "forms/COMBO ver 1.1.pdf"));
    InputStream objectData = object.getObjectContent();

    PdfReader reader;
    PdfStamper stamper = null;
    BaseFont bf;

    PipedOutputStream pdfBytes = new PipedOutputStream();

    try {           
        reader = new PdfReader(objectData);
        stamper = new PdfStamper(reader, pdfBytes);

        bf = BaseFont.createFont(BaseFont.HELVETICA, BaseFont.CP1252, BaseFont.NOT_EMBEDDED);

        PdfContentByte over = stamper.getOverContent(1);
        over.beginText();
        over.setColorFill(BaseColor.BLACK);
        over.setFontAndSize(bf, 12);
        over.setTextMatrix(120,717);
        over.showText("this is edited text");
        over.endText();

        PipedInputStream inputStream = new PipedInputStream(pdfBytes);

        ObjectMetadata meta = new ObjectMetadata();
        meta= object.getObjectMetadata();
        meta.setContentLength(inputStream.available());         

        s3client.putObject(new PutObjectRequest(bucketName, "forms/123.pdf", inputStream, meta));           

    } catch (IOException e) {
        e.printStackTrace();
    } catch (DocumentException e) {
        e.printStackTrace();
    } 
    finally
    {
        stamper.close();            
        objectData.close();
    }
    return "PDF Created";
}

1 answer

  1. # Answer 1

    The problem is not in AWS or iText, but in the way PipedInputStream and PipedOutputStream are handled.

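    In particular, most of the meaningful data is written to the PDF only when stamper.close() is called, but the content length is set with meta.setContentLength(inputStream.available()); before the stamper is closed, so the length is wrong. After putObject returns, the inputStream instance has been closed (check its internal closedByReader field), but pdfBytes is still connected to it, and nothing can be written to a pipe whose reading end is closed. When stamper.close(); is then called, it throws an exception because it can no longer write to inputStream.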
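
Both effects can be reproduced with the JDK alone, without iText or S3. The following is only a minimal sketch: the class and method names are made up for illustration, closing the input stream by hand stands in for what the answer says happens after putObject, and the single written byte stands in for the data that stamper.close() tries to flush.

```java
import java.io.IOException;
import java.io.PipedInputStream;
import java.io.PipedOutputStream;

class PipeClosedDemo {

    /** Bytes readable from a freshly connected pipe before anything has been written. */
    static int availableBeforeWrite() {
        try (PipedOutputStream out = new PipedOutputStream();
             PipedInputStream in = new PipedInputStream(out)) {
            return in.available();
        } catch (IOException e) {
            throw new IllegalStateException(e);
        }
    }

    /** Closes the reading end first, then writes, and returns the resulting error message. */
    static String writeAfterReaderClosed() {
        try {
            PipedOutputStream out = new PipedOutputStream();
            PipedInputStream in = new PipedInputStream(out);
            in.close();     // stand-in for the reading end being closed after the upload
            out.write(42);  // stand-in for the bytes flushed by stamper.close()
            return "no error";
        } catch (IOException e) {
            return e.getMessage();
        }
    }

    public static void main(String[] args) {
        System.out.println(availableBeforeWrite());    // prints 0
        System.out.println(writeAfterReaderClosed());  // prints Pipe closed
    }
}
```

The zero from available() matches the empty file in the bucket, and the second message is the same "Pipe closed" text reported in CloudWatch.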

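    I don't think any attempt to fix this within the current approach will be good enough, because the PipedInputStream documentation states explicitly: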

    Typically, data is read from a PipedInputStream object by one thread and data is written to the corresponding PipedOutputStream by some other thread. Attempting to use both objects from a single thread is not recommended, as it may deadlock the thread.

    So one solution, although not memory-efficient, is to use ByteArrayOutputStream and ByteArrayInputStream:

    ByteArrayOutputStream pdfBytes = new ByteArrayOutputStream();
    
    try {
        reader = new PdfReader(objectData);
        stamper = new PdfStamper(reader, pdfBytes);
    
        bf = BaseFont.createFont(BaseFont.HELVETICA, BaseFont.CP1252, BaseFont.NOT_EMBEDDED);
    
        PdfContentByte over = stamper.getOverContent(1);
        over.beginText();
        over.setColorFill(BaseColor.BLACK);
        over.setFontAndSize(bf, 12);
        over.setTextMatrix(120,717);
        over.showText("this is edited text");
        over.endText();
    
        stamper.close();
        objectData.close();
    
        // Reuse the source object's metadata, but override the length with the size of the stamped PDF
        ObjectMetadata meta = object.getObjectMetadata();
        byte[] pdf = pdfBytes.toByteArray();
        ByteArrayInputStream inputStream = new ByteArrayInputStream(pdf);
        meta.setContentLength(pdf.length);
    
        s3client.putObject(new PutObjectRequest(bucketName, "forms/123.pdf", inputStream, meta));      
    
    } catch (IOException e) {
        e.printStackTrace();
    } catch (DocumentException e) {
        e.printStackTrace();
    }
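
The important detail in the snippet above is the order: the writer is closed first, and only then is the length measured. Here is a small sketch of why that matters, using a BufferedOutputStream as a hypothetical stand-in for PdfStamper (it is not iText, it merely also holds data back until close()). The class and method names are invented for this example.

```java
import java.io.BufferedOutputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.OutputStream;
import java.nio.charset.StandardCharsets;

class CloseBeforeMeasureDemo {

    /** Returns {size before close, size after close} of the in-memory target. */
    static int[] sizesAroundClose() {
        ByteArrayOutputStream target = new ByteArrayOutputStream();
        byte[] payload = "this is edited text".getBytes(StandardCharsets.US_ASCII);
        int before;
        try {
            // Hypothetical stand-in for PdfStamper: keeps bytes in its own buffer until close()
            OutputStream writer = new BufferedOutputStream(target);
            writer.write(payload);
            before = target.size();  // nothing has reached the target yet
            writer.close();          // flushes the buffered bytes into the target
        } catch (IOException e) {
            throw new IllegalStateException(e);
        }
        return new int[] { before, target.size() };
    }

    public static void main(String[] args) {
        int[] sizes = sizesAroundClose();
        System.out.println(sizes[0] + " -> " + sizes[1]);  // prints 0 -> 19
    }
}
```

Measuring before close() would report a length of 0, which is the same mistake as calling available() before stamper.close() in the question.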
    

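    PDFs are usually not very large, so you can hold them in memory. If you want to reduce memory consumption, you should process the PDF file in a separate thread. I suggest searching for general examples of using PipedInputStream and PipedOutputStream.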
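
For reference, the two-thread pattern looks roughly like this. This is only a sketch with made-up names and plain strings instead of a PDF: in the real case the producer thread would run the PdfStamper code and the consumer side would be the upload, which is an assumption about how you would wire it up, not something tested against S3.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.PipedInputStream;
import java.io.PipedOutputStream;
import java.nio.charset.StandardCharsets;

class PipedThreadsDemo {

    /** Produces data on a separate thread and consumes it on the calling thread. */
    static String transfer(String text) {
        try {
            PipedOutputStream out = new PipedOutputStream();
            PipedInputStream in = new PipedInputStream(out);

            Thread producer = new Thread(() -> {
                // Closing the writing end signals end-of-stream to the reader
                try (PipedOutputStream o = out) {
                    o.write(text.getBytes(StandardCharsets.UTF_8));
                } catch (IOException e) {
                    e.printStackTrace();
                }
            });
            producer.start();

            // The consumer drains the pipe while the producer is still writing
            ByteArrayOutputStream received = new ByteArrayOutputStream();
            byte[] buffer = new byte[1024];
            int n;
            while ((n = in.read(buffer)) != -1) {
                received.write(buffer, 0, n);
            }
            in.close();
            producer.join();
            return new String(received.toByteArray(), StandardCharsets.UTF_8);
        } catch (IOException e) {
            throw new IllegalStateException(e);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(transfer("hello pipe"));  // prints hello pipe
    }
}
```

Because the writer and the reader run on different threads, the writer can block on a full pipe without deadlocking, and only a small buffer is held in memory at any time. Note that the total length is not known until the producer finishes, so this does not by itself solve setContentLength.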