41

I want to convert PDF pages into images (PNG, JPEG/JPG or GIF). I want them at full-page size.

How can this be done using Java? What libraries are available for achieving this?

Aleksander Blomskøld
yohan.jayarathna
  • 1
    Oh, I would be interested in knowing. It would be good if there were a resizing option as well. – Nishant Feb 03 '11 at 12:19
  • 2
    http://stackoverflow.com/questions/356550/a-good-library-for-converting-pdf-to-tiff – jmj Feb 03 '11 at 12:20
  • @Nishant: when you get Image object you are free to transform it ;) – Maxym Feb 03 '11 at 12:41
  • Look at this [post](http://stackoverflow.com/questions/4929813/convert-pdf-to-thumbnail-image-in-java/4930488#4930488). – sdorra Feb 08 '11 at 07:15
  • look at https://github.com/shareefhiasat/PDFRenderer_withConvertPDFtoJPG_PNG_GIF_BMP – shareef May 16 '16 at 18:43
  • See https://pdfbox.apache.org/2.0/migration.html under PDF Rendering for details in how to do this in PDFBox 2.0.0 – gordon613 Dec 06 '17 at 16:37

6 Answers

34

In the Ghost4J library (http://ghost4j.sourceforge.net), version 0.4.0 and later, you can use a SimpleRenderer to do the job in a few lines of code:

  1. Load the PDF file (for a PostScript file, use the PSDocument class instead):

        PDFDocument document = new PDFDocument();
        document.load(new File("input.pdf"));
    
  2. Create the renderer

        SimpleRenderer renderer = new SimpleRenderer();
    
        // set resolution (in DPI)
        renderer.setResolution(300);
    
  3. Render

        List<Image> images = renderer.render(document);
    

You can then do whatever you want with the image objects. For example, you can write them out as PNG files like this:

            for (int i = 0; i < images.size(); i++) {
                ImageIO.write((RenderedImage) images.get(i), "png", new File((i + 1) + ".png"));
            }
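The cast to RenderedImage in the loop above only succeeds when each image actually implements that interface (BufferedImage does). A minimal sketch of a more defensive writer follows, using only the standard library. The class name `PageImageWriter`, the helper names and the `page-N.png` file naming are assumptions made for illustration; they are not part of Ghost4J.

```java
import java.awt.Graphics2D;
import java.awt.Image;
import java.awt.image.BufferedImage;
import java.awt.image.RenderedImage;
import java.io.File;
import java.io.IOException;
import java.nio.file.Files;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import javax.imageio.ImageIO;

// Hypothetical helper (not part of Ghost4J): writes a list of page images as PNG files.
public class PageImageWriter {

    // Returns the image unchanged if it is already a RenderedImage,
    // otherwise paints it onto a new BufferedImage.
    static RenderedImage toRendered(Image image) {
        if (image instanceof RenderedImage) {
            return (RenderedImage) image;
        }
        int width = image.getWidth(null);
        int height = image.getHeight(null);
        if (width <= 0 || height <= 0) {
            throw new IllegalArgumentException("Image dimensions are not available yet");
        }
        BufferedImage copy = new BufferedImage(width, height, BufferedImage.TYPE_INT_RGB);
        Graphics2D g = copy.createGraphics();
        try {
            g.drawImage(image, 0, 0, null);
        } finally {
            g.dispose();
        }
        return copy;
    }

    // Writes page i to outputDir/page-(i+1).png and returns the files written.
    static List<File> writePages(List<? extends Image> pages, File outputDir) throws IOException {
        List<File> written = new ArrayList<>();
        for (int i = 0; i < pages.size(); i++) {
            File target = new File(outputDir, "page-" + (i + 1) + ".png");
            // ImageIO.write returns false when no suitable writer is found.
            if (!ImageIO.write(toRendered(pages.get(i)), "png", target)) {
                throw new IOException("No PNG writer available for " + target);
            }
            written.add(target);
        }
        return written;
    }

    public static void main(String[] args) throws IOException {
        // Demo with a blank image; in practice the list would come from the renderer.
        File dir = Files.createTempDirectory("pages").toFile();
        Image blank = new BufferedImage(100, 140, BufferedImage.TYPE_INT_RGB);
        System.out.println(writePages(Collections.singletonList(blank), dir));
    }
}
```

With Ghost4J, the list returned by `renderer.render(document)` could be passed straight to `writePages`.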

Note: Ghost4J uses the native Ghostscript C API, so Ghostscript must be installed on your machine.

I hope it will help you :)

zippy1978
  • 1
    Hey I am getting an error saying "Exception in thread "main" java.lang.UnsatisfiedLinkError: Unable to load library 'gsdll32': The specified module could not be found." I have already installed Ghostscript latest version. Please help :( – yohan.jayarathna Feb 07 '11 at 06:20
  • 1
    This means that the Ghostscript library was not found... On which OS are you working? Make sure the .dll / .so is on the system library path. – zippy1978 Feb 13 '11 at 18:08
  • Simply installing Ghostscript did not work for me. I resolved this by dropping gsdll32.dll into the Eclipse project folder. – WelcomeTo Aug 28 '12 at 07:12
  • is Ghost4J reliable in multi-threaded environments? I felt the documentation was vague – Don Cheadle Feb 19 '15 at 00:34
  • http://www.ghost4j.org/threadsafetyandmultithreading.html – Don Cheadle Feb 19 '15 at 02:58
  • For anyone having an issue, check this out: http://stackoverflow.com/questions/31996746/unable-to-load-library-gs-with-ghost4j – Wyetro Aug 13 '15 at 20:24
  • Caused by: java.lang.ClassNotFoundException: com.lowagie.text.pdf.PdfTemplate – amdev Aug 24 '17 at 14:44
32

Apache PDFBox can convert PDFs to JPG, BMP, WBMP, PNG, and GIF.

The library even comes with a command line utility called PDFToImage to do this.

If you download the source code and look at the PDFToImage class, you should be able to figure out how to use PDFBox to convert PDFs to images from your own Java code.

Dónal Boyle
  • it's somewhat inconsistent for images. If there is a "ColorPattern" (not an image but similar.. confusing) in the source PDF, it will not be copied over to the destination image. http://stackoverflow.com/questions/28589477/pdfbox-pdf-to-image-losing-qr-code-colorspace-pattern-doesnt-provide-a-non-str?noredirect=1#comment45487987_28589477 – Don Cheadle Feb 19 '15 at 00:30
  • but there may be improvements in PDFBox's 2.x release! (hoping) – Don Cheadle Feb 19 '15 at 00:32
  • 3
    See https://pdfbox.apache.org/2.0/migration.html under PDF Rendering for details in how to do this in PDFBox 2.0.0 – gordon613 Dec 06 '17 at 16:37
10

You will need a PDF renderer. There are a few more or less good ones on the market (ICEPdf, pdfrenderer); without one, you will have to rely on external tools. The free PDF renderers also cannot render embedded fonts, so they are only good for creating thumbnails (which may be all you want).

My favorite external tool is Ghostscript, which can convert PDFs to images with a single command line invocation.

The following batch script converts PostScript (and PDF?) files to BMP for us. Treat it as a guide to adapt to your needs (note that gs needs the environment variables set in order to work!):

    pushd
    setlocal

    Set BIN_DIR=C:\Program Files\IKOffice_ACME\bin
    Set GS=C:\Program Files\IKOffice_ACME\gs
    Set GS_DLL=%GS%\gs8.54\bin\gsdll32.dll
    Set GS_LIB=%GS%\gs8.54\lib;%GS%\gs8.54\Resource;%GS%\fonts
    Set Path=%Path%;%GS%\gs8.54\bin
    Set Path=%Path%;%GS%\gs8.54\lib

    call "%GS%\gs8.54\bin\gswin32c.exe" -q -dSAFER -dNOPAUSE -dBATCH -sDEVICE#bmpmono -r600x600 -sOutputFile#%2 -f %1

    endlocal
    popd
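A Ghostscript call like this can also be launched from Java with ProcessBuilder. The following is a minimal sketch, not a tested integration. The class name `GhostscriptRunner`, the `png16m` output device, the 300 DPI value in `main` and the `page-%03d.png` output pattern are choices made here for illustration, and it assumes the Ghostscript executable (for example `gs`, or `gswin32c.exe` on Windows) can be found by the name or path you pass in.

```java
import java.io.File;
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;

// Hypothetical wrapper that runs the Ghostscript command-line tool from Java.
public class GhostscriptRunner {

    // Builds the argument list; kept separate so it can be inspected without running gs.
    static List<String> buildCommand(String gsExecutable, File pdf, File outputDir, int dpi) {
        List<String> cmd = new ArrayList<>();
        cmd.add(gsExecutable);
        cmd.add("-q");               // suppress startup messages
        cmd.add("-dSAFER");          // restrict file access
        cmd.add("-dNOPAUSE");        // do not wait for a key press between pages
        cmd.add("-dBATCH");          // exit after the last file
        cmd.add("-sDEVICE=png16m");  // 24-bit colour PNG output (assumed choice)
        cmd.add("-r" + dpi);         // resolution in DPI
        // %03d is expanded by Ghostscript to the page number, giving one file per page.
        cmd.add("-sOutputFile=" + new File(outputDir, "page-%03d.png").getPath());
        cmd.add("-f");
        cmd.add(pdf.getPath());
        return cmd;
    }

    // Runs Ghostscript and returns its exit code (0 normally means success).
    static int convert(String gsExecutable, File pdf, File outputDir, int dpi)
            throws IOException, InterruptedException {
        Process process = new ProcessBuilder(buildCommand(gsExecutable, pdf, outputDir, dpi))
                .inheritIO() // show Ghostscript's own messages on our console
                .start();
        return process.waitFor();
    }

    public static void main(String[] args) throws Exception {
        if (args.length != 3) {
            System.err.println("usage: GhostscriptRunner <gs-executable> <input.pdf> <output-dir>");
            return;
        }
        int exitCode = convert(args[0], new File(args[1]), new File(args[2]), 300);
        System.out.println("Ghostscript exited with code " + exitCode);
    }
}
```

Because ProcessBuilder passes each argument directly to the process without going through a shell, paths containing spaces do not need extra quoting here.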

UPDATE: PDFBox can now render embedded fonts, so Ghostscript is no longer needed.

Daniel
  • Hi Daniel, thank you for quick reply, Can I automate Ghostscript using Java ? If it is possible how can I do it ? Where I can find very good Ghostscript tutorial, Thanks again! – yohan.jayarathna Feb 03 '11 at 12:26
  • Maybe have a look at Ghost4J http://ghost4j.sourceforge.net/coreapisamples.html – anergy Feb 03 '11 at 12:32
  • It's not quite right that "the free renderers can't render embedded fonts" - at least jPodRenderer does so... – mtraut Feb 03 '11 at 14:52
  • 1
    @mtraut: jPodRenderer: Commercial licensing is available for a very moderate flat fee per developer seat of 4.900€... very free :) It is just free for GPLed projects. – Daniel Feb 03 '11 at 15:08
  • @daniel maybe i'm not up to date - but GPL is still one of the most common **free** licenses. I simply don't get it why it seems to be silly that commercial use costs money. And this free version is not a crippled subset... – mtraut Feb 03 '11 at 16:24
  • I know they want to make money, and it is their right to do so, but if it is not LGPL I cannot use it, so it's not really free, like Apache or the like. – Daniel Feb 03 '11 at 16:44
  • I know this is old but I just want to say that suggesting Ghostscript as a free commercial friendly library is incorrect. Ghostscript is licensed under AGPL which is a more strict version of GPL – Zaid Amir Oct 01 '15 at 18:40
  • @ZaidAmir: I know it became even older, but Ghostscript can be used if used "at arm's length", and if the product is not a tool that mimics Ghostscript's behaviour or functionality. It is a bit problematic, OK, but I understood the license as allowing this use. – Daniel Jan 28 '19 at 22:07
1

jPDFImages is a commercial (not free) library that converts PDF pages to images in JPEG, TIFF or PNG format. The output image size is customizable.

alaris
  • Are you affiliated with that product? Please be sure to read the faq's on promotion http://stackoverflow.com/faq#promotion – Leigh Apr 13 '12 at 03:35
0

If the GPL is fine for you, you may also want to look at jPodRenderer (SourceForge).

mtraut