I have a PDF with multiple pages, and each page looks like this: [image]

Is there a way to split each page? I mean taking each part of the page, starting at the title outlined in red, and putting it on a new page. So if the PDF has one page like the one in the picture, I would like to end up with a new PDF with four pages. I have searched for a starting point but found nothing useful. If you cannot provide a complete solution, links to relevant topics would also help.

I am not trying to parse data from the PDF file. I am trying to cut each page into pieces and place each piece on a new page. Each cut-out part starts at the title and extends for 8 lines.

  • An idea: I took a closer look at the PDF and found that every page has the same layout. How can I split each page into parts based on height? For example, the first cut-out would be 200 pt high, the second and third would also be 200 pt each, and the fourth would be larger, at 300 pt.
YasserKhalil
  • I don't think there is a straightforward option. If the document always has the same structure, PyMuPDF gives you a lot of flexible options for working with a PDF. [See this link](https://github.com/pymupdf/PyMuPDF/issues/708), which is still live as of today. It may give you some ideas. – Kris Feb 28 '22 at 10:59
  • I think this question should not be closed (at least, the linked duplicate is not a good match). – Kris Feb 28 '22 at 11:01
  • @Kris Thanks a lot. I have edited the question to put an idea. – YasserKhalil Feb 28 '22 at 11:04
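
The fixed-height idea from the question could be sketched roughly as follows with PyMuPDF, the library suggested in the comments. This is only a sketch under assumptions: the function names `band_boxes` and `split_pages` are made up for illustration, the heights `(200, 200, 200, 300)` are taken from the example in the question, the pages are assumed to be unrotated, and every page is assumed to share the same layout. Each band is copied as vector content onto its own new page, so nothing is parsed or rasterized.

```python
def band_boxes(page_width, page_height, heights):
    """Stack bands of the given heights (in points) from the top of a page.

    Returns (x0, y0, x1, y1) tuples with a top-left origin, which is the
    convention PyMuPDF uses. A band that crosses the bottom edge is cut off
    there, and bands that would start below the page are dropped.
    """
    boxes = []
    top = 0.0
    for h in heights:
        if top >= page_height:
            break
        bottom = min(top + h, page_height)
        boxes.append((0.0, top, float(page_width), bottom))
        top = bottom
    return boxes


def split_pages(src_path, dst_path, heights=(200, 200, 200, 300)):
    """Write a new PDF in which every band of every source page is a page."""
    # PyMuPDF is imported here so band_boxes can be used without it installed.
    import fitz

    src = fitz.open(src_path)
    out = fitz.open()  # new, empty PDF
    for page in src:
        rect = page.rect  # top-left corner is (0, 0)
        for box in band_boxes(rect.width, rect.height, heights):
            clip = fitz.Rect(box)
            # The new page is exactly as large as the band it will hold.
            new_page = out.new_page(width=clip.width, height=clip.height)
            # Copy only the clipped region of the source page onto it.
            new_page.show_pdf_page(new_page.rect, src, page.number, clip=clip)
    out.save(dst_path)
    out.close()
    src.close()
```

Calling `split_pages("input.pdf", "output.pdf")` (both file names are placeholders) would then turn each source page into up to four pages. If the titles do not sit at exactly the same height on every page, the band heights would have to be measured per page instead of hard-coded.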

0 Answers