
I am using PDFBox, and the following snippet reads a PDF file and converts its first page to a PNG image. It works well, except that the barcode in the PDF is completely missing from the output image.

Does anyone know how to work around this with PDFBox? Is that even possible? Thanks.

PDDocument doc = PDDocument.load(new File("INPUT.pdf"));
PDPage page = (PDPage) doc.getDocumentCatalog().getAllPages().get(0);
BufferedImage image = page.convertToImage();
File outputfile = new File("image.png");
ImageIO.write(image, "png", outputfile);
Shaggy
pabhb
  • Please provide the PDF in question. That being said, PDFBox's conversion to image does not support all PDF features, so some loss is to be expected. – mkl Feb 26 '14 at 05:03
  • I've had the same experience. This is a good question. There must be a way to do this with PDFBox... Or is it acceptable that it only copies text (and not images of any kind) when converting a PDF to an image? That doesn't seem to make sense. – Don Cheadle Feb 18 '15 at 00:26
  • It seems JPedal may currently handle this better than PDFBox: http://stackoverflow.com/questions/22332791/converting-pdf-to-image-with-proper-formatting (the OP's own accepted answer uses JPedal) – Don Cheadle Feb 18 '15 at 00:35

1 Answer


The barcode image is stored in a format that PDFBox cannot decode on its own. You are missing one of the optional extensions, such as these:

  • Reading JBIG2 images: JBIG2 ImageIO or JBIG2-Image-Decoder
  • Reading JPEG 2000 (JPX) images: JAI Image I/O Tools Core

More information here.
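One way to check whether such an extension is actually on the classpath is to ask ImageIO which readers it has registered. This is only a minimal sketch using the standard `javax.imageio` API. The class name `ImageIoPluginCheck` and the helper `hasReader` are made up for illustration, and the format names `"jbig2"` and `"jpeg2000"` are assumptions about what the plugins register, so adjust them if your plugin uses different names.

```java
import javax.imageio.ImageIO;
import javax.imageio.ImageReader;
import java.util.Iterator;

public class ImageIoPluginCheck {

    // Hypothetical helper: true if at least one registered ImageIO reader
    // claims to handle the given format name.
    static boolean hasReader(String formatName) {
        Iterator<ImageReader> readers = ImageIO.getImageReadersByFormatName(formatName);
        return readers.hasNext();
    }

    public static void main(String[] args) {
        // Ask ImageIO to look for plugin JARs that are on the classpath.
        ImageIO.scanForPlugins();

        // "png" ships with the JDK; the other two names are assumptions
        // about how the JBIG2 and JPEG 2000 plugins register themselves.
        String[] formats = {"png", "jbig2", "jpeg2000"};
        for (String format : formats) {
            System.out.println(format + " reader available: " + hasReader(format));
        }
    }
}
```

If the JBIG2 or JPEG 2000 line reports `false`, the corresponding plugin JAR is probably not on the classpath of the application that runs PDFBox.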

david.perez