I don't have enough reputation right now to answer a question I found: how to use Python to split PDF pages in half and recombine them for further processing.

#!/usr/bin/env python

'''
Chops each page in half (e.g. to extract the individual pages of a
source created in booklet form) and recombines the halves in order.
'''
from PyPDF2 import PdfFileWriter, PdfFileReader
# Split left: crop every page to its left half
# (crop box coordinates are in PDF points, 1/72 inch)
with open("docu.pdf", "rb") as in_f:
    input1 = PdfFileReader(in_f)
    output = PdfFileWriter()

    numPages = input1.getNumPages()

    for i in range(numPages):
        page = input1.getPage(i)
        page.cropBox.lowerLeft = (60, 50)
        page.cropBox.upperRight = (305, 700)
        output.addPage(page)

    with open("left.pdf", "wb") as out_f:
        output.write(out_f)
# Split right: re-read the source so the pages are fresh objects,
# then crop every page to its right half
with open("docu.pdf", "rb") as in_f:
    input1 = PdfFileReader(in_f)
    output = PdfFileWriter()

    numPages = input1.getNumPages()

    for i in range(numPages):
        page = input1.getPage(i)
        page.cropBox.lowerLeft = (300, 50)
        page.cropBox.upperRight = (540, 700)
        output.addPage(page)

    with open("right.pdf", "wb") as out_f:
        output.write(out_f)

# Combine the split files, alternating left and right halves
with open("left.pdf", "rb") as left_f, open("right.pdf", "rb") as right_f:
    input1 = PdfFileReader(left_f)
    input2 = PdfFileReader(right_f)
    output = PdfFileWriter()
    numPages = input1.getNumPages()

    for i in range(numPages):
        output.addPage(input1.getPage(i))  # left half first
        output.addPage(input2.getPage(i))  # then the matching right half

    # Write while the input files are still open
    with open("out.pdf", "wb") as out_f:
        output.write(out_f)

Note: The cropping parameters are specific to your PDF, so please check them before running the program.
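Since the crop values depend on the page size, one option is to derive them from the page bounds instead of hardcoding them. The following is only a minimal sketch of that arithmetic: `half_boxes` and its `margin` parameter are hypothetical names that are not part of PyPDF2, and it assumes the split should fall exactly on the vertical midline of the page.

```python
def half_boxes(llx, lly, urx, ury, margin=0.0):
    """Split a page rectangle into left and right halves.

    Takes the lower-left (llx, lly) and upper-right (urx, ury) corners
    in PDF points and returns two (lower_left, upper_right) pairs.
    `margin` (hypothetical parameter) trims the same amount from every
    outer edge; the two halves always meet at the vertical midline.
    """
    mid_x = (llx + urx) / 2.0
    left = ((llx + margin, lly + margin), (mid_x, ury - margin))
    right = ((mid_x, lly + margin), (urx - margin, ury - margin))
    return left, right


# Example: a US Letter page, 612 x 792 points
left, right = half_boxes(0, 0, 612, 792)
print(left)   # ((0.0, 0.0), (306.0, 792.0))
print(right)  # ((306.0, 0.0), (612.0, 792.0))
```

In the script above, the corners could be read from page.mediaBox.lowerLeft and page.mediaBox.upperRight (converted to float first), and each returned pair assigned to page.cropBox.lowerLeft and page.cropBox.upperRight.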

Further: You can now extract text from this document without the columns getting merged into each other and garbling the result, provided the extraction tool honors each page's crop box.

Azlan Khan
  • I'll upvote your question so you can post answers in the future, but please keep in mind that questions are supposed to be for asking questions. Also, please link the original question. – PythonPikachu8 Mar 02 '21 at 01:03
  • Thanks! I really wanted to share the code; that's why I posted it as a question, because I wasn't able to post the answer. Sorry for this. – Azlan Khan Mar 02 '21 at 01:04
  • Original question : https://stackoverflow.com/questions/27336586/how-to-split-crop-a-pdf-along-the-middle-using-pypdf – Azlan Khan Mar 02 '21 at 01:06
  • A suggestion is to put the link into the answer itself (you can edit it). – PythonPikachu8 Mar 02 '21 at 01:07
  • I’m voting to close this question because this is not how to use S.O. please don't litter. –  Dec 11 '21 at 18:50
  • I’m voting to close this question because it's an answer to a different question – SuperStormer Sep 27 '22 at 03:45
