I am trying to add images as new pages to PDF files. The JPEG images are very large. They get added fine, but I run into a problem when converting the PDF to TIFF: it throws an OutOfMemoryError. Is there a way to compress these JPEG files?

Below is the code that converts the images to PDF:

    PDImageXObject image = PDImageXObject.createFromFile(imagePath, doc);
    // the page is sized to the image's pixel dimensions, interpreted as points
    PDPage page = new PDPage(new PDRectangle(image.getWidth(), image.getHeight()));
    doc.addPage(page);
    // try-with-resources closes the content stream, so no explicit close() is needed
    try (PDPageContentStream contents = new PDPageContentStream(doc, page)) {
        contents.drawImage(image, 0, 0);
    }

Below is the code that converts the PDF to TIFF:

    document = PDDocument.load(new File(pdfFilename));
    PDFRenderer pdfRenderer = new PDFRenderer(document);
    BufferedImage[] images = new BufferedImage[document.getNumberOfPages()];

    for (int i = 0; i < images.length; i++) {
        try {
            // the OutOfMemoryError is thrown at this line
            images[i] = pdfRenderer.renderImageWithDPI(i, 300, ImageType.RGB);
        } catch (IOException e) {
            LOGGER.error("Exception while reading merged pdf file:" + e.getMessage());
            throw e;
        }
    }

    File tempFile = new File(tiffFileName + ".tiff");
    ImageWriter writer = ImageIO.getImageWritersByFormatName("TIFF").next();
    ImageOutputStream output = ImageIO.createImageOutputStream(tempFile);
    writer.setOutput(output);
    ImageWriteParam params = writer.getDefaultWriteParam();
    params.setCompressionMode(ImageWriteParam.MODE_EXPLICIT);
    params.setCompressionType("JPEG");
    writer.prepareWriteSequence(null);
    for (int i = 0; i < images.length; i++) {
        writer.writeToSequence(new IIOImage(images[i], null, null), params);
    }
    writer.endWriteSequence();
Divya Neelaiah
  • "I’m getting an OutOfMemoryError. What can I do?": https://pdfbox.apache.org/2.0/faq.html#outofmemoryerror A good start is -Xmx2g and then go down from there. – Tilman Hausherr Feb 01 '18 at 10:54
  • Please make sure you're using the latest version, which is 2.0.8. Memory usage has been optimized in several versions since 2.0.0. (I mention this just in case). – Tilman Hausherr Feb 01 '18 at 10:58
  • Hi Tilman, yes, we are using the latest version. – Divya Neelaiah Feb 01 '18 at 11:15
  • One easy way to reduce memory use in your code/app is to process one page at a time. Your code currently reads and renders *all* the pages into memory (the `images` array), and then writes all the pages later. You could render each page and write it to the TIFF inside the loop instead. – Harald K Feb 01 '18 at 11:17
  • Heh heh, I missed that one, even though it's the third item in the FAQ I linked to and I wrote that segment myself. – Tilman Hausherr Feb 01 '18 at 12:11
  • @TilmanHausherr I added a few suggestions from the link and it did help for now, but I am still worried because when it goes live there will be a lot of images to process. – Divya Neelaiah Feb 01 '18 at 12:30
  • If you can, run a non-live test with an enormous number of PDFs from your client. If you fixed the problem that Harald mentioned, the biggest risk is PDFs that use a lot of space. Typically these are 1) complex, beautiful PDFs done by illustrators, i.e. not invoices, and 2) PDFs with scans at a very high resolution, e.g. 600 dpi or even higher. – Tilman Hausherr Feb 01 '18 at 12:36
  • @haraldK Thank you, I updated the code as you suggested. – Divya Neelaiah Feb 01 '18 at 12:52
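
Harald K's page-at-a-time suggestion could look roughly like the sketch below. It is not the code the asker ended up with, only an illustration that relies on the standard javax.imageio TIFF writer (bundled with Java 9 and later). The class name `PageAtATimeTiff` and the `PageRenderer` callback are hypothetical names introduced here so the example does not depend on PDFBox; the compression type is passed in as a parameter.

```java
import java.awt.image.BufferedImage;
import java.io.File;
import java.io.IOException;
import java.util.Iterator;
import javax.imageio.IIOImage;
import javax.imageio.ImageIO;
import javax.imageio.ImageWriteParam;
import javax.imageio.ImageWriter;
import javax.imageio.stream.ImageOutputStream;

final class PageAtATimeTiff {

    /** Hypothetical callback that produces the rendered image for one page. */
    interface PageRenderer {
        BufferedImage render(int pageIndex) throws IOException;
    }

    /** Renders and writes each page in turn, so only one page image is held at a time. */
    static void write(File tiffFile, int pageCount, PageRenderer renderer,
                      String compressionType) throws IOException {
        Iterator<ImageWriter> writers = ImageIO.getImageWritersByFormatName("TIFF");
        if (!writers.hasNext()) {
            throw new IOException("No TIFF ImageWriter available");
        }
        ImageWriter writer = writers.next();
        try (ImageOutputStream output = ImageIO.createImageOutputStream(tiffFile)) {
            writer.setOutput(output);
            ImageWriteParam params = writer.getDefaultWriteParam();
            params.setCompressionMode(ImageWriteParam.MODE_EXPLICIT);
            params.setCompressionType(compressionType);
            writer.prepareWriteSequence(null);
            for (int i = 0; i < pageCount; i++) {
                // Render one page, append it to the TIFF, then drop the reference.
                BufferedImage image = renderer.render(i);
                writer.writeToSequence(new IIOImage(image, null, null), params);
                image.flush();
            }
            writer.endWriteSequence();
        } finally {
            writer.dispose();
        }
    }
}
```

With the renderer from the question, the callback would presumably be `i -> pdfRenderer.renderImageWithDPI(i, 300, ImageType.RGB)` and the page count `document.getNumberOfPages()`. Peak memory then depends on the largest single page rather than on the sum of all pages.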

1 Answer

To avoid the OutOfMemoryError, try starting your application with the -Xms and -Xmx JVM parameters set to suitable values. See the post "What are the Xms and Xmx parameters when starting JVMs?" and https://docs.oracle.com/cd/E21764_01/web.1111/e13814/jvm_tuning.htm#PERFM150:

Setting initial and minimum heap size

-Xms Oracle recommends setting the minimum heap size (-Xms) equal to the maximum heap size (-Xmx) to minimize garbage collections.

Setting maximum heap size

-Xmx Setting a low maximum heap value compared to the amount of live data decreases performance by forcing frequent garbage collections.
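
To get a feel for what a "suitable" -Xmx value might be, it can help to estimate how large one rendered page is. The sketch below is a back-of-the-envelope calculation, not PDFBox code. It assumes the renderer scales the page by dpi / 72 (PDF user space defaults to 72 points per inch), rounds the pixel dimensions down, and stores 4 bytes per pixel for an RGB image; the class and method names are hypothetical.

```java
final class RenderMemoryEstimate {

    /** Default PDF user space resolution: 72 points per inch. */
    static final double POINTS_PER_INCH = 72.0;

    /**
     * Approximate size in bytes of the raster for one page.
     * Assumption: pixel size = floor(points * dpi / 72), at least 1 pixel per side.
     */
    static long renderedBytes(double pageWidthPt, double pageHeightPt,
                              double dpi, int bytesPerPixel) {
        double scale = dpi / POINTS_PER_INCH;
        long widthPx = (long) Math.max(Math.floor(pageWidthPt * scale), 1);
        long heightPx = (long) Math.max(Math.floor(pageHeightPt * scale), 1);
        return widthPx * heightPx * bytesPerPixel;
    }

    public static void main(String[] args) {
        // A page created from a 4000 x 3000 pixel image, as in the question's code,
        // is 4000 x 3000 points, far larger than a typical A4 page (about 595 x 842 points).
        long bigPage = renderedBytes(4000, 3000, 300, 4);
        long a4Page = renderedBytes(595, 842, 300, 4);
        System.out.println("4000x3000 pt page at 300 DPI: " + bigPage / (1024 * 1024) + " MiB");
        System.out.println("A4 page at 300 DPI: " + a4Page / (1024 * 1024) + " MiB");
    }
}
```

Because the question's code sizes each page to the image's pixel dimensions, rendering at 300 DPI multiplies each side by roughly 4.17, so a single page from a multi-megapixel JPEG can need several hundred megabytes on its own. Under these assumptions, rendering at a lower DPI or creating smaller pages may reduce memory more than raising the heap limit does.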

6006604