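// Needs PDFBox 2.x: PDDocument, PDFRenderer, ImageType (org.apache.pdfbox.rendering)
// and ImageIOUtil (org.apache.pdfbox.tools.imageio), plus java.io and java.awt.image.
public String convertPdfPagesToImages(File file, String outputImageDir)
{
    // try-with-resources closes the document even if rendering throws
    try (PDDocument document = PDDocument.load(file))
    {
        PDFRenderer pdfRenderer = new PDFRenderer(document);
        for (int page = 0; page < document.getNumberOfPages(); ++page)
        {
            // Render at 300 DPI: one PDF unit (1/72 inch) becomes 300/72 pixels
            BufferedImage bim = pdfRenderer.renderImageWithDPI(page, 300, ImageType.RGB);
            // Write into outputImageDir, which was previously ignored
            String outputPath = new File(outputImageDir, page + "- output.jpg").getPath();
            ImageIOUtil.writeImage(bim, outputPath, 300);
        }
    }
    catch (IOException e)
    {
        e.printStackTrace();
        return null;
    }
    return "";
}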

  • I am using the code above to convert PDF pages to images at 300 DPI. I came up with the following relation between the x and y coordinates of text in the PDF and the same text marked in the image: Xim = Xpdf*dpi/72; Yim = [Ypdf-(Hpdfpage/96)]*dpi/72, which seems to work fine. However, I cannot work out how the height and width of a rectangle marked in the image relate to those in the PDF page. Could anyone help me with this? I am using the PDFBox 2.0.0 library.
  • Where exactly do you retrieve those coordinates `Xpdf, Ypdf` from? PDFBox unfortunately uses different coordinate systems in different classes and does not clearly indicate this. – mkl Sep 08 '16 at 13:50
  • I'd recommend having a look at the DrawPrintTextLocations.java example in the source code download; maybe this can clarify things a bit. Also be aware that PDF coordinates start at the bottom left. And don't forget to update to 2.0.2. – Tilman Hausherr Sep 08 '16 at 14:48
  • I am using processTextPosition method of pdfbox to obtain pdf coordinates. – sharmishtha kulkarni Sep 09 '16 at 05:26
  • There's a fairly good description in https://stackoverflow.com/a/21523407/290182 of user coordinates. `org.apache.pdfbox.pdmodel.PDPage` has a private field `private static final int DEFAULT_USER_SPACE_UNIT_DPI = 72;` which matches your assumed 72 DPI. – beldaz Oct 11 '17 at 19:32
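
For what it's worth, here is a minimal sketch of the scaling discussed in the comments, not a confirmed answer. It assumes the rectangle is given in default PDF user space (72 units per inch, origin at the bottom left, as noted above), that the page is not rotated, and that the page box starts at (0, 0). Under those assumptions width and height need only the same dpi/72 factor as x, with no offset, and only the y position needs flipping. The class and method names (`PdfToImageScale`, `toImageRect`) are made up for illustration and are not part of PDFBox. If your coordinates already have a top-left origin, which may be the case depending on which PDFBox class they come from, skip the flip.

```java
import java.util.Arrays;

// Hypothetical helper, not part of PDFBox. Pure arithmetic, no library calls.
final class PdfToImageScale {

    // Assumed default user space resolution: 72 units per inch.
    static final double USER_SPACE_DPI = 72.0;

    /**
     * Converts a rectangle from PDF user space (bottom-left origin, assumed)
     * to image pixels (top-left origin).
     *
     * @param x          left edge of the rectangle in PDF units
     * @param y          bottom edge of the rectangle in PDF units
     * @param width      rectangle width in PDF units
     * @param height     rectangle height in PDF units
     * @param pageHeight page height in PDF units (e.g. 792 for US Letter)
     * @param dpi        resolution used when rendering the page image
     * @return {xImage, yImage, widthImage, heightImage} in pixels,
     *         where (xImage, yImage) is the top-left corner
     */
    static double[] toImageRect(double x, double y, double width, double height,
                                double pageHeight, double dpi) {
        double scale = dpi / USER_SPACE_DPI;
        double xImage = x * scale;
        // Flip the y axis: the top edge of the rectangle in PDF space is y + height.
        double yImage = (pageHeight - (y + height)) * scale;
        // Sizes are differences of coordinates, so they scale without any offset.
        double widthImage = width * scale;
        double heightImage = height * scale;
        return new double[] { xImage, yImage, widthImage, heightImage };
    }

    public static void main(String[] args) {
        // A 144 x 36 unit rectangle, one inch from the left and bottom of a
        // US Letter page (792 units tall), rendered at 300 DPI.
        double[] rect = toImageRect(72, 72, 144, 36, 792, 300);
        System.out.println(Arrays.toString(rect));
    }
}
```

Going the other way (image rectangle back to PDF units) is the same arithmetic with the factor inverted, i.e. multiply by 72/dpi. If the page is rotated or its crop box does not start at the origin, this sketch would need extra handling.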
