DOCX to PDF in Python

Question

I'm trying to convert DOCX files to PDF (without Word or any other Microsoft software), but I found nothing that does it. I then tried to do the whole process on the PDF side instead, that is, replacing text in an existing PDF, but this code doesn't work either. I need one approach or the other, but I can't find anything that works. Could someone help?

Thanks

This is the code I'm using to try to change the text in the PDF itself. It isn't working, and even if it did, I would need it to preserve the layout of the images, fonts, etc. So the best option for me would be a way to go from DOCX to PDF directly.


import PyPDF2

def replace_text_in_pdf(input_path, output_path, old_text, new_text):
    # PdfReader accepts a file path directly, so no file handle is left open
    pdf = PyPDF2.PdfReader(input_path)
    writer = PyPDF2.PdfWriter()

    for page in pdf.pages:
        # extract_text() returns plain text only, without fonts or layout
        content = page.extract_text()
        modified_content = content.replace(old_text, new_text)

        media_box = page.mediabox
        new_page = PyPDF2.PageObject.create_blank_page(None, width=media_box[2], height=media_box[3])
        # Attempt to put the edited text back into the page (the part that does not work)
        page._data = modified_content.encode()
        new_page.merge_page(page)

        writer.add_page(new_page)

    with open(output_path, 'wb') as output_file:
        writer.write(output_file)


input_file_path = '/Users/eden/Downloads/downloaded-4.pdf'
output_file_path = '/Users/eden/Downloads/MAKINOTEEE.pdf'
old_text = 'Hola'
new_text = 'POTORRO'

replace_text_in_pdf(input_file_path, output_file_path, old_text, new_text)

https://www.geeksforgeeks.org/convert-docx-to-pdf-usinf-docx2pdf-module-in-python/ — FlyingTeller, Jun 20 '23 at 14:07
docx2pdf uses Word as a bridge to do it, so I cannot use it. — Reel Alfer, Jun 20 '23 at 14:11
What about the other suggestion from the linked question, msoffice2pdf? It works with LibreOffice. — FlyingTeller, Jun 20 '23 at 14:14
LibreOffice destroys some of the styles and images in my docx. — Reel Alfer, Jun 20 '23 at 14:37
It's difficult to prove a negative, but I am convinced that you will find no perfect solution for a Word document that does not involve Word... LibreOffice is already a huge competitor, and you can see the result of them trying to read Word documents as best as possible... — FlyingTeller, Jun 20 '23 at 14:41
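
For reference, the LibreOffice route discussed in the comments can be sketched roughly as below. This is only a minimal sketch and not taken from the thread: it assumes LibreOffice is installed and that its `soffice` binary is on the PATH (otherwise pass its full path), and the helper names `build_convert_command` and `convert_docx_to_pdf` are hypothetical. As noted above, the rendering fidelity is whatever LibreOffice produces, so styles and images may still differ from Word's output.

```python
import shutil
import subprocess
from pathlib import Path


def build_convert_command(docx_path, out_dir, soffice="soffice"):
    """Build the headless LibreOffice command line for a DOCX -> PDF conversion.

    The helper name and the default binary name are assumptions for illustration.
    """
    return [
        soffice,
        "--headless",            # run without opening the GUI
        "--convert-to", "pdf",   # target format
        "--outdir", str(out_dir),
        str(docx_path),
    ]


def convert_docx_to_pdf(docx_path, out_dir, soffice="soffice"):
    """Run LibreOffice headless and return the expected path of the PDF."""
    if shutil.which(soffice) is None:
        raise FileNotFoundError(f"{soffice!r} was not found; is LibreOffice installed?")
    subprocess.run(build_convert_command(docx_path, out_dir, soffice), check=True)
    # LibreOffice names the output after the input file, with a .pdf extension.
    return Path(out_dir) / (Path(docx_path).stem + ".pdf")
```

A call such as `convert_docx_to_pdf("report.docx", "out")` would then be expected to leave the result at `out/report.pdf`; the file names here are placeholders.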
