How to detect the horizontal lines and their location in a PDF page using PDFBox version 3.0.

Horizontal lines in PDFBox version 1.x

I tried to follow the link above and implement that solution, but a lot has changed in PDFBox since then, so it is difficult to apply as written.

I am unable to upload a sample PDF as it is confidential, but the scenarios include horizontal table border lines and lines below text, both single and double.

Amit Singh
  • https://stackoverflow.com/questions/38931422/pdfbox-2-0-2-calling-of-pagedrawer-processpage-method-caught-exceptions – Tilman Hausherr Apr 30 '22 at 17:57
  • Amit, have you been able to find the lines in your PDF using the mechanisms from [@Tilman's answer](https://stackoverflow.com/a/38933039/1729265) and [this follow-up](https://stackoverflow.com/a/55223836/1729265)? – mkl May 01 '22 at 14:10
  • @mkl, I have been able to read the horizontal lines and their locations from the PDF page. Unfortunately, I am seeing weird behaviour with PDFBox: it splits some numbers (not all of them) into two parts, for example 25,000 into 2 and 5,000, when I call setSortByPosition(true). Any idea what causes this? – Amit Singh May 02 '22 at 20:40
  • I'm not aware of ever having observed something like that. Please share enough code and an example PDF to allow reproducing the issue. – mkl May 02 '22 at 21:44
  • @mkl I have opened a separate question for this, as it needed more detail than the comment section allows: [link](https://stackoverflow.com/questions/72093362/pdbbox-number-split-when-setting-the-setsortbyposition-to-true) – Amit Singh May 02 '22 at 23:26
  • Yes, but unfortunately without the PDF. Without the PDF all we can do is speculate and guess. – mkl May 03 '22 at 04:53

1 Answer

@Tilman Hausherr Thank you for referring me to the link on horizontal line detection. I am now able to identify the horizontal lines and their locations. Line detection and location
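The code used here was not posted, so what follows is only a minimal sketch of the kind of classification logic such a detector might use, not the actual implementation. It assumes you subclass PDFBox's `PDFGraphicsStreamEngine` (as the linked answers do) and forward its path callbacks to a helper. `HorizontalLineCollector`, its method names, and the two thresholds are hypothetical and not part of PDFBox; the helper depends only on `java.awt.geom`, so it can be tried without a PDF.

```java
import java.awt.geom.Line2D;
import java.awt.geom.Point2D;
import java.awt.geom.Rectangle2D;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper (not part of PDFBox): collects horizontal lines from
// path-construction and path-painting callbacks.
class HorizontalLineCollector {
    private final double tolerance;     // max |y1 - y2| for a segment to count as horizontal
    private final double maxThickness;  // rectangles at most this tall are treated as lines
    private final List<Line2D> segments = new ArrayList<>();        // current path: segments
    private final List<Rectangle2D> rectangles = new ArrayList<>(); // current path: rectangles
    private final List<Line2D> lines = new ArrayList<>();           // detected lines
    private Point2D current;

    HorizontalLineCollector(double tolerance, double maxThickness) {
        this.tolerance = tolerance;
        this.maxThickness = maxThickness;
    }

    void moveTo(double x, double y) {
        current = new Point2D.Double(x, y);
    }

    void lineTo(double x, double y) {
        if (current != null) {
            segments.add(new Line2D.Double(current.getX(), current.getY(), x, y));
        }
        current = new Point2D.Double(x, y);
    }

    void appendRectangle(Point2D p0, Point2D p1, Point2D p2, Point2D p3) {
        // Bounding box of the four corners.
        Rectangle2D r = new Rectangle2D.Double(p0.getX(), p0.getY(), 0, 0);
        r.add(p1);
        r.add(p2);
        r.add(p3);
        rectangles.add(r);
        current = p0;
    }

    // Stroked path: horizontal segments and rectangle edges are visible lines.
    void strokePath() {
        for (Line2D s : segments) {
            addIfHorizontal(s.getX1(), s.getY1(), s.getX2(), s.getY2());
        }
        for (Rectangle2D r : rectangles) {
            if (r.getHeight() <= maxThickness) {
                addThinRectangle(r);
            } else {
                addIfHorizontal(r.getMinX(), r.getMinY(), r.getMaxX(), r.getMinY());
                addIfHorizontal(r.getMinX(), r.getMaxY(), r.getMaxX(), r.getMaxY());
            }
        }
        endPath();
    }

    // Filled path: only very thin rectangles look like lines (a common way to draw rules).
    void fillPath() {
        for (Rectangle2D r : rectangles) {
            if (r.getHeight() <= maxThickness) {
                addThinRectangle(r);
            }
        }
        endPath();
    }

    // Path ended without painting (for example a clipping path): discard it.
    void endPath() {
        segments.clear();
        rectangles.clear();
        current = null;
    }

    List<Line2D> getLines() {
        return lines;
    }

    private void addThinRectangle(Rectangle2D r) {
        if (r.getWidth() > r.getHeight()) {
            lines.add(new Line2D.Double(r.getMinX(), r.getCenterY(), r.getMaxX(), r.getCenterY()));
        }
    }

    private void addIfHorizontal(double x1, double y1, double x2, double y2) {
        if (Math.abs(y1 - y2) <= tolerance && Math.abs(x1 - x2) > tolerance) {
            double y = (y1 + y2) / 2;
            lines.add(new Line2D.Double(Math.min(x1, x2), y, Math.max(x1, x2), y));
        }
    }
}
```

In an engine subclass, the overrides of `moveTo`, `lineTo`, `appendRectangle`, `strokePath`, `fillPath`, `fillAndStrokePath` and `endPath` would simply delegate to this helper; that wiring is assumed and not shown. Under this sketch, a double underline would show up as two detected lines with nearly the same x-range and y values a small distance apart. Coordinates are whatever the callbacks deliver, which as far as I know is PDF page space with y growing upward, so they may need flipping before being compared with text positions.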

Amit Singh
  • Glad it works, but then it's obviously a duplicate, so the better solution would be to upvote that answer and delete your question. Or share your code if it is different from or better than the one there. – Tilman Hausherr May 03 '22 at 03:47