I'm trying to convert DOCX files to PDF (without Word or any other Microsoft software), but I couldn't find a way to do it. So I tried doing the whole process in PDF instead, by replacing text directly in an existing PDF, but this code doesn't work either. I need one approach or the other to work, but I can't find anything that does. Could someone help?
Thanks
This is the code I'm using to try to change the text in the PDF itself. It isn't working, and even if it did, I would need it to work without breaking the layout of the images, fonts, etc. So the best solution for me would be a way to go from DOCX to PDF directly.
import PyPDF2


def replace_text_in_pdf(input_path, output_path, old_text, new_text):
    pdf = PyPDF2.PdfReader(open(input_path, 'rb'))
    writer = PyPDF2.PdfWriter()
    for page_number in range(len(pdf.pages)):
        page = pdf.pages[page_number]
        content = page.extract_text()
        modified_content = content.replace(old_text, new_text)
        media_box = page.mediabox
        new_page = PyPDF2.PageObject.create_blank_page(None, width=media_box[2], height=media_box[3])
        page._data = modified_content.encode()
        new_page.merge_page(page)
        writer.add_page(new_page)
    with open(output_path, 'wb') as output_file:
        writer.write(output_file)


input_file_path = '/Users/eden/Downloads/downloaded-4.pdf'
output_file_path = '/Users/eden/Downloads/MAKINOTEEE.pdf'
old_text = 'Hola'
new_text = 'POTORRO'
replace_text_in_pdf(input_file_path, output_file_path, old_text, new_text)
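
For the direct DOCX-to-PDF route, one option that avoids Word is LibreOffice in headless mode. What follows is only a minimal sketch, assuming LibreOffice is installed and its `soffice` binary is on the PATH. The helper names `build_convert_command`, `expected_pdf_path`, and `docx_to_pdf` are hypothetical, made up for illustration, and how closely the layout, fonts, and images match the original depends on LibreOffice's rendering of the document.

```python
import subprocess
from pathlib import Path


def build_convert_command(docx_path, out_dir, soffice="soffice"):
    """Build the LibreOffice command line that converts one file to PDF.

    Assumption: `soffice` is the LibreOffice launcher and is on the PATH;
    pass a full path via the `soffice` argument if it is not.
    """
    return [
        soffice,
        "--headless",            # run without opening the GUI
        "--convert-to", "pdf",   # target format
        "--outdir", str(out_dir),
        str(docx_path),
    ]


def expected_pdf_path(docx_path, out_dir):
    """LibreOffice keeps the input's base name and swaps the extension."""
    return Path(out_dir) / (Path(docx_path).stem + ".pdf")


def docx_to_pdf(docx_path, out_dir):
    """Convert a .docx to PDF and return the path of the expected output.

    Raises subprocess.CalledProcessError if soffice exits with an error,
    and FileNotFoundError if the soffice binary cannot be found.
    """
    subprocess.run(build_convert_command(docx_path, out_dir), check=True)
    return expected_pdf_path(docx_path, out_dir)


# Example call (hypothetical file names):
# pdf_path = docx_to_pdf("report.docx", "converted")
```

On macOS the binary is usually inside the application bundle (`/Applications/LibreOffice.app/Contents/MacOS/soffice`) rather than on the PATH, so it may need to be passed explicitly through the `soffice` argument of `build_convert_command`.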