
How to split PDF file in the middle of any page?

The text on one of the pages contains the marker "@placeholder", and I want to split the PDF at the exact position where it appears.


Is it possible with iText or PDFBox? Or with some other library?

i_p
  • You are aware that a PDF can be "painted" in any order - there is no requirement to start at the top and end at the bottom? – Thorbjørn Ravn Andersen May 25 '17 at 16:00
  • Unfortunately, I know. I need to create a mechanism that finds "@placeholder" in a PDF file (containing tables, text, images, etc.) and divides the document exactly there. All the split examples I've found break the document only at page boundaries. – i_p May 25 '17 at 16:11
  • In PDFBox you could duplicate the page and create a cropbox that goes to the appropriate locations. – Tilman Hausherr May 25 '17 at 16:24
  • Yes, it's possible, but first I have to find the appropriate locations. So how do I find the location of a word in the file? – i_p May 26 '17 at 06:12
  • You can find an example of how to split pages using iText in my `PdfVeryDenseMergeTool` example in [this answer](https://stackoverflow.com/a/29078954/1729265). You can also see there how split positions can be found using a render listener. In your case you need a text extraction strategy (which is a special render listener, one only interested in text commands) to find the placeholder position. – mkl May 26 '17 at 09:25
  • In PDFBox you'd have to create the detection mechanism yourself, see the PrintTextLocations.java example. You'd know the location of each character of "@placeholder" and would have to recognise that you found the whole word. – Tilman Hausherr May 27 '17 at 11:28
  • @i_p *"How to split PDF file in the middle of any page"* - by the way, does this imply your document may have more pages? – mkl Jun 26 '17 at 20:33
  • @mkl Yes, the place to cut can be on the first, second, or any other page. – i_p Jun 29 '17 at 14:35
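
The two steps suggested in the comments (recognise the whole word from per-character positions, then give two copies of the page different crop boxes) can be sketched in plain Java without any PDF library. This is only an illustration under stated assumptions: `Glyph`, `Box`, `findMarker` and `splitAt` are hypothetical names, not iText or PDFBox API. It assumes the glyphs arrive in reading order, that coordinates use the PDF convention with the origin at the bottom-left of the page, and that the cut goes just above the marker line so the marker ends up in the lower part.

```java
import java.util.ArrayList;
import java.util.List;

final class PlaceholderSplit {

    /** Hypothetical glyph record: text plus baseline position, origin at the page's bottom-left. */
    static final class Glyph {
        final String text;
        final float x, y, height;
        Glyph(String text, float x, float y, float height) {
            this.text = text; this.x = x; this.y = y; this.height = height;
        }
    }

    /** Hypothetical rectangle given by its lower-left and upper-right corners. */
    static final class Box {
        final float llx, lly, urx, ury;
        Box(float llx, float lly, float urx, float ury) {
            this.llx = llx; this.lly = lly; this.urx = urx; this.ury = ury;
        }
    }

    /** Index of the glyph where the marker starts, or -1; glyphs must be in reading order. */
    static int findMarker(List<Glyph> glyphs, String marker) {
        if (marker.isEmpty()) return -1;
        StringBuilder joined = new StringBuilder();
        List<Integer> owner = new ArrayList<>(); // owner.get(k) = index of the glyph that produced char k
        for (int i = 0; i < glyphs.size(); i++) {
            String t = glyphs.get(i).text;      // a glyph may carry several chars (e.g. a ligature)
            joined.append(t);
            for (int k = 0; k < t.length(); k++) owner.add(i);
        }
        int pos = joined.indexOf(marker);
        return pos < 0 ? -1 : owner.get(pos);
    }

    /** Splits the page at height y (clamped to the page): [0] is the part above, [1] the part below. */
    static Box[] splitAt(Box page, float y) {
        float cut = Math.max(page.lly, Math.min(page.ury, y));
        return new Box[] {
            new Box(page.llx, cut, page.urx, page.ury),
            new Box(page.llx, page.lly, page.urx, cut)
        };
    }

    public static void main(String[] args) {
        // Made-up glyph data standing in for what a text extraction pass would report.
        List<Glyph> glyphs = new ArrayList<>();
        String line = "See @placeholder";
        for (int i = 0; i < line.length(); i++) {
            glyphs.add(new Glyph(String.valueOf(line.charAt(i)), 72f + 6f * i, 500f, 12f));
        }
        int start = findMarker(glyphs, "@placeholder");
        Glyph first = glyphs.get(start);
        // Cut just above the marker line (baseline plus glyph height).
        Box[] parts = splitAt(new Box(0f, 0f, 612f, 792f), first.y + first.height);
        System.out.println("marker starts at glyph " + start + ", cut at y=" + parts[0].lly);
    }
}
```

In PDFBox the glyph data would presumably come from the `TextPosition` objects that the PrintTextLocations example prints, and each resulting box would be applied to one copy of the page as its crop box. Coordinates reported by text extraction may be measured from the top of the page, so they may need converting before use. Cropping only hides content outside the box; it does not remove it from the file.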
