In another question I'm looking at org.apache.pdfbox.text.TextPosition and am trying to understand the x- and y-coordinates and what xscale and yscale are.

How do these values relate to the overall page bounding box?

String[155.67801,41.046 fs=6.9738 xscale=6.9738 height=4.3167825 space=1.9387167 width=3.8774261]a
String[159.55544,41.046 fs=6.9738 xscale=6.9738 height=4.3167825 space=1.9387167 width=3.4868927]c
String[163.04233,41.046 fs=6.9738 xscale=6.9738 height=4.3167825 space=1.9387167 width=1.9387207]t
String[164.98105,41.046 fs=6.9738 xscale=6.9738 height=4.3167825 space=1.9387167 width=3.8774261]u
String[168.85847,41.046 fs=6.9738 xscale=6.9738 height=4.3167825 space=1.9387167 width=3.8774261]a
String[172.7359,41.046 fs=6.9738 xscale=6.9738 height=4.3167825 space=1.9387167 width=1.9387207]t
String[174.67462,41.046 fs=6.9738 xscale=6.9738 height=4.3167825 space=1.9387167 width=3.8774261]o
String[178.55205,41.046 fs=6.9738 xscale=6.9738 height=4.3167825 space=1.9387167 width=2.322281]r
Jason S
  • First of all, it would help if you added some more information. You can reference your other question, but I think the important information should be part of this question as well, including the example that generated the output. I'm not very familiar with PDF, but I have used *PDFBox* a little in the past. You can get the bounding box with something like `document.getPage(0).getTrimBox()`. You might find more information in the [documentation](https://pdfbox.apache.org/docs/2.0.7/javadocs/org/apache/pdfbox/pdmodel/PDPage.html#getTrimBox()). – JojOatXGME Jan 22 '19 at 23:09
  • Updated the question -- the documentation for TextPosition is very sparse. – Jason S Jan 22 '19 at 23:17
  • OK, the `getTrimBox()` hint seems helpful. Judging from this particular PDF, the `getTrimBox()` coordinates are in the same coordinate system as the individual TextPosition x and y elements, and they are in points (72 points = 1 inch; what little graphic design experience I have is helpful). (example trimbox coordinates are `[0.0,0.0,226.038,98.248]` and at 300dpi I get a 941x409 bitmap.) Thanks! – Jason S Jan 22 '19 at 23:22
  • See also https://stackoverflow.com/questions/54040872/textposition-bounding-box-pdfbox and https://stackoverflow.com/questions/54019603/pdfbox-textposition-width-and-height-in-mm – Tilman Hausherr Jan 23 '19 at 06:51
  • The best page bounding box is `getCropBox()`. This is what you see. – Tilman Hausherr Jan 23 '19 at 06:54
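
The point-to-pixel arithmetic mentioned in the comments (72 points = 1 inch, rendered at 300 dpi) can be sketched in plain Java. This is only an illustration: the class name `PointsToPixels` and the method `toPixels` are hypothetical and not part of PDFBox. It also assumes the fractional pixel is truncated, which matches the 941x409 figure quoted above; a renderer could round differently.

```java
public class PointsToPixels {
    // PDF user space unit: 72 points per inch.
    static final double POINTS_PER_INCH = 72.0;

    // Hypothetical helper: converts a length in points to whole pixels
    // at the given resolution, truncating the fractional pixel (assumption).
    static int toPixels(double points, double dpi) {
        return (int) Math.floor(points * dpi / POINTS_PER_INCH);
    }

    public static void main(String[] args) {
        // Width and height taken from the trim box quoted in the comments:
        // [0.0, 0.0, 226.038, 98.248]
        double widthPt = 226.038;
        double heightPt = 98.248;
        double dpi = 300.0;

        int widthPx = toPixels(widthPt, dpi);
        int heightPx = toPixels(heightPt, dpi);

        System.out.println(widthPx + "x" + heightPx); // prints "941x409"
    }
}
```

The same scale factor (`dpi / 72`) would apply to the `TextPosition` x, y, width and height values if, as observed in the comments for this particular PDF, they share the page box's coordinate system and are expressed in points.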
