I want to split a PDF based on a value that appears on every page. Each value should end up in its own PDF file. I currently have the following list, which pairs each value with the page it appears on:
l = [
{'abr': '123 ', 'page': 1},
{'abr': '125 ', 'page': 2},
{'abr': '125 ', 'page': 3},
{'abr': '140 ', 'page': 4},
{'abr': '142 ', 'page': 5},
]
I want to "merge" the dicts so that every "abr" is unique inside the list and all pages belonging to that abr are collected in a list.
I thought of something like the following:
l = [
{'abr': '123 ', 'page': [1]},
{'abr': '125 ', 'page': [2, 3]},
{'abr': '140 ', 'page': [4]},
{'abr': '142 ', 'page': [5]},
]
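One way the merge could be done is sketched below. It is only a sketch: the helper name `merge_by_abr` is made up for illustration, and it relies on plain dicts keeping insertion order (Python 3.7+), so each abr stays in the order it was first seen.

```python
def merge_by_abr(entries):
    """Collect the page numbers of each 'abr', keeping first-seen order."""
    pages_by_abr = {}
    for entry in entries:
        # setdefault creates the list the first time an abr shows up
        pages_by_abr.setdefault(entry['abr'], []).append(entry['page'])
    # rebuild the list-of-dicts shape shown above
    return [{'abr': abr, 'page': pages} for abr, pages in pages_by_abr.items()]


l = [
    {'abr': '123 ', 'page': 1},
    {'abr': '125 ', 'page': 2},
    {'abr': '125 ', 'page': 3},
    {'abr': '140 ', 'page': 4},
    {'abr': '142 ', 'page': 5},
]
merged = merge_by_abr(l)
print(merged)
```

The abr values are left untouched here, including their trailing space.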
That's because I need a for loop over every abr in which I can get all of its pages, so I can do something like:
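from PyPDF2 import PdfFileReader, PdfFileWriter

pdf = PdfFileReader(path)
for abr in l:
    # one writer, and therefore one output file, per abr
    pdf_writer = PdfFileWriter()
    for page in abr["page"]:
        # Assumption: the page numbers in l are 1-based, while getPage() takes a 0-based index
        pdf_writer.addPage(pdf.getPage(page - 1))
    # output_filename still has to be chosen per abr
    with open(output_filename, 'wb') as out:
        pdf_writer.write(out)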
Is there a good / simple way to do this? Or does anyone have a better way to structure the data, or an easier way to split the PDF?
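As a possible alternative structure, the list of dicts could be replaced by a single dict that maps each abr directly to its pages. The sketch below assumes that the trailing space in the abr values is unwanted and strips it. The function name `pages_by_abr` and the `<abr>.pdf` naming scheme are hypothetical and only there for illustration.

```python
from collections import defaultdict


def pages_by_abr(entries):
    """Map each stripped 'abr' value to the list of its page numbers."""
    grouped = defaultdict(list)
    for entry in entries:
        grouped[entry['abr'].strip()].append(entry['page'])
    return dict(grouped)


entries = [
    {'abr': '123 ', 'page': 1},
    {'abr': '125 ', 'page': 2},
    {'abr': '125 ', 'page': 3},
    {'abr': '140 ', 'page': 4},
    {'abr': '142 ', 'page': 5},
]

for abr, pages in pages_by_abr(entries).items():
    output_filename = f"{abr}.pdf"  # hypothetical naming scheme
    print(output_filename, pages)
```

With this shape the outer loop yields the abr and its pages together, so a file name can be derived from the key without a second lookup.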