I am using PDFTron to scan a document and extract annotation information into a custom format, and I have a problem with Ellipse and Square annotations.

For this custom format, I need the width and height of the square (which can be a rectangle). All annotations can be rotated. For a rotated rectangle I am able to get the bounding box using square.GetVisibleContentBox() and the rotation angle using this approach:

        var appearance = square.GetAppearance();
        var matrixApp = appearance.FindObj("Matrix");
        var matrixObject = new Matrix2D(matrixApp.GetAt(0).GetNumber(), matrixApp.GetAt(1).GetNumber(),
            matrixApp.GetAt(2).GetNumber(), matrixApp.GetAt(3).GetNumber(), matrixApp.GetAt(4).GetNumber(),
            matrixApp.GetAt(5).GetNumber());

        RotationAngle = GetRotationFromMatrix(matrixObject);
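
The GetRotationFromMatrix helper is not shown above. A minimal sketch of what such a helper might compute, assuming the appearance matrix is a pure rotation plus translation (no scaling or skew). The class and method names are hypothetical, and the method takes the raw a and b entries rather than a Matrix2D so that it runs without PDFNet:

```csharp
using System;

public static class MatrixRotation
{
    // For a PDF matrix [a b c d e f] that is a pure rotation plus translation,
    // a = cos(theta) and b = sin(theta), with theta measured counterclockwise.
    // Returns theta in degrees; a negative result means a clockwise rotation.
    public static double CounterclockwiseDegrees(double a, double b)
    {
        return Math.Atan2(b, a) * 180.0 / Math.PI;
    }
}
```

With the first two Matrix entries from the appearance stream quoted further down (0.8660253 and -0.5000002), this gives approximately -30, i.e. 30 degrees clockwise.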

I use the same approach for a rotated Ellipse annotation (I need the semi-major and semi-minor axes of the Ellipse). But how can I get the Rectangle width and height, or the Ellipse axes, from the bounding box and rotation? For the rectangle I have tried simple math, using this post, but it does not work with a 45 degree rotation. And I have no idea how to retrieve the Ellipse axes.

I opened the PDF document and found this for a rotated Circle annotation:

endstream
endobj
497 0 obj<</Subj(Ellipse)/Type/Annot/P 477 0 R/F 4/C[1 0 0]/CreationDate(D:20180130093056+03'00')/T(User-PC)/Subtype/Circle/M(D:20180130093101+03'00')/AP<</N 499 0 R>>/RD[0.5 0.5 0.5 0.5]/Rect[75.96057 481.4219 511.1196 824.4295]/NM(BNJOBSFLFNHJWWZE)/Rotation 30>>
endobj
498 0 obj[497 0 R]
endobj
499 0 obj<</Type/XObject/Subtype/Form/FormType 1/BBox[88.18503 573.4521 498.8952 732.3994]/Resources<</ProcSet[/PDF]>>/Matrix[0.8660253 -0.5000002 0.5000002 0.8660253 -363.0966 -247.1763]/Filter/FlateDecode/Length 116>>
stream

Pay attention to the obj<</Type/XObject/Subtype/Form/FormType 1/BBox string. This BBox is the original Ellipse bounding box (without rotation); I have checked this. If I have the unrotated BBox, I can get the axes of the Ellipse and the dimensions of the Rectangle. But how do I retrieve this XObject for an annotation?

To summarize: I need to retrieve the real dimensions of the Rectangle and the Circle. It is hard to do using simple math. I have found out that the original bounding boxes are saved in the PDF, but I do not know how to get this information from the Annot object. Or maybe you can suggest another approach to get the dimensions?

EDIT: You can download a sample file here.

Stalso
  • Perhaps you could post/attach a sample PDF with the issue. Also, why do you need to retrieve the "real dimensions of Rectangle and Circle"? – Ryan Jan 30 '18 at 17:17
  • As I explained, we need to convert annotations to our custom format. I know that we can rotate an annotation by applying a rotation matrix to its appearance. In fact, I had an idea to rotate the annotation back using the rotation matrix, read the bounding box, and then return it to its initial state, but I failed with that. I have provided the file in my question. – Stalso Jan 30 '18 at 17:34
  • As I described in the question, I found the place where the original (not rotated) bounding box is saved, but I do not know how to read this information. I really need help with it. – Stalso Jan 30 '18 at 17:47
  • Regarding the attached file, I have never seen a "Rotation" entry in a PDF before, only "Rotate", which is limited to 90 degree increments clockwise. The annotation also has a custom appearance, so if any other PDF vendor needs to redraw the annotation, it will lose that 45 degree rotation. Try moving and resizing that annotation in Acrobat to see what I mean. – Ryan Jan 31 '18 at 22:39

1 Answer

Thank you for the sample file. The annotation contains a Rotation value, which is not part of the PDF standard, and I am not aware of any other PDF vendors handling it. Assuming it follows the PDF standard convention, this angle represents clockwise rotation. To answer your question, there are two ways: a simple one, and a more complicated but more reliable one.

The first way is to simply assume that the displayed rectangle touches the edges of the annotation's bounding box (its Rect). Then, using the SO answer you linked to in your question, the variables would be the following.

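// Clockwise rotation in degrees, read from the non-standard Rotation entry.
double cw_rotation_in_degrees = annot.GetSDFObj().FindObj("Rotation").GetNumber();
// Convert to a counterclockwise angle in radians.
double t = (360.0 - cw_rotation_in_degrees) / 180.0 * Math.PI;
// Width and height of the annotation's bounding rectangle.
double bx = annot.GetRect().Width();
double by = annot.GetRect().Height();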
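
A minimal sketch of how the inner width and height might then be solved from bx, by and the angle, under the same assumption that the rectangle touches all four edges of the bounding box. The names RotatedBox and InnerSize are hypothetical, not part of PDFNet. The sketch takes absolute values of the sine and cosine so that the sign and quadrant of the angle do not matter, and it returns null near 45 degrees, where the two equations become dependent and only the sum of width and height can be recovered (which would explain why the simple math fails at 45 degrees):

```csharp
using System;

public static class RotatedBox
{
    // Recovers the side lengths (w, h) of a rectangle rotated by angleDegrees
    // from the width bx and height by of its axis-aligned bounding box.
    // Assumes the rectangle touches all four edges of the bounding box, so that
    //   bx = w*|cos t| + h*|sin t|   and   by = w*|sin t| + h*|cos t|.
    // Returns null when the angle is too close to an odd multiple of 45 degrees,
    // because the system then has no unique solution.
    public static (double Width, double Height)? InnerSize(double bx, double by, double angleDegrees)
    {
        double t = angleDegrees * Math.PI / 180.0;
        double c = Math.Abs(Math.Cos(t));
        double s = Math.Abs(Math.Sin(t));
        double det = c * c - s * s; // determinant of the 2x2 system
        if (Math.Abs(det) < 1e-9)
            return null;
        double w = (bx * c - by * s) / det;
        double h = (by * c - bx * s) / det;
        return (w, h);
    }
}
```

For the Circle annotation quoted in the question, Rect is about 435.159 by 343.008 and Rotation is 30, which yields about 410.71 by 158.95. That matches the BBox of its appearance stream, so in that file Rect appears to be the bounding box of the rotated BBox rather than a tight bound of the ellipse itself.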

The second, harder way, if you don't trust the assumption above, is to use the PDFNet ElementReader sample code to read the raw path commands and the current GState transformation. https://www.pdftron.com/pdfnet/samplecode/ElementReaderAdvTest.cs.html

In particular, take note of line 49 of the sample.

This is much more involved, but if the rectangle/ellipse does not exactly touch the edges of the annotation bounding box, it would be the only certain way to calculate the dimensions.

Ryan
  • Annotations can be rotated using some third-party software, Bluebeam Revu for example. Then they have a rotation matrix, as I have described. – Stalso Jan 30 '18 at 17:18