I am using PDFTextStripper to extract text from a PDF. I want to get the width and height, in millimeters, of each TextPosition. These can be read from a given TextPosition tp using tp.getWidth() and tp.getHeight(). My problem is that the values returned are in display units. I looked for the right conversion factor but got confused. I know that PDFs use different coordinate systems, as explained in the PDF documentation (picture below).
I also found this post, but it may be outdated since I am using PDFBox 2.0.12. The variables described in that post no longer exist in the PDPage class, but I found these constants in the PDRectangle class:
/** user space units per inch */
private static final float POINTS_PER_INCH = 72;
/** user space units per millimeter */
private static final float POINTS_PER_MM = 1 / (10 * 2.54f) * POINTS_PER_INCH;
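If display units turned out to be the same as user space units (1/72 inch), which is an assumption I have not been able to confirm, the conversion would only need the inverse of POINTS_PER_MM. A minimal sketch in plain Java; the class and method names are hypothetical and not part of PDFBox:

```java
final class UnitConversion {
    // Same values as the private constants in PDRectangle.
    private static final float POINTS_PER_INCH = 72f;
    private static final float MM_PER_INCH = 25.4f;

    private UnitConversion() {
    }

    /**
     * Converts a length in points (1/72 inch) to millimeters.
     * ASSUMPTION: the input really is in points; whether that holds
     * for TextPosition values is exactly what is being asked.
     */
    static float pointsToMillimeters(float points) {
        return points * MM_PER_INCH / POINTS_PER_INCH;
    }

    public static void main(String[] args) {
        // Hypothetical value standing in for tp.getWidth().
        float width = 72f;
        // One inch in points should come out at about 25.4 mm.
        System.out.println(pointsToMillimeters(width) + " mm");
    }
}
```

With this, tp.getWidth() and tp.getHeight() would each be passed through pointsToMillimeters, provided the assumption about the unit is correct.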
My question is: in which space is a display unit defined, and how can I convert it to millimeters?
Many thanks,