I have a .docx file stored in cloud storage, and I want to convert it to PDF and store the result there as well, using a Python cloud function. I don't want to download the file, convert it on my local machine, and upload it back to storage, so simple code like the following can't be used:

from docx import Document  # the class name is capitalized in python-docx
from docx2pdf import convert

doc = Document()
doc.save('some_path')
# docx_path and pdf_path are local file paths
convert(docx_path, pdf_path)

I tried using io.BytesIO like the following example:

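import io
from docx import Document  # the class name is capitalized in python-docx
from docx2pdf import convert

file_stream = io.BytesIO()
file_stream_pdf = io.BytesIO()
doc = Document()
doc.save(file_stream)  # python-docx accepts a file-like object here
file_stream.seek(0)
convert(file_stream, file_stream_pdf)  # fails: convert expects paths, not streams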

but I am getting the error:

TypeError: expected str, bytes or os.PathLike object, not BytesIO.

I know convert takes paths as strings, but I don't know how to use BytesIO in this case.

Is there a simple way to fix what I have done so far or perhaps a different approach?

Amit Goft

1 Answer

Try writing the contents of the BytesIO object to a temporary file. See the Python documentation for the tempfile module for more information.

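import io
import os
import tempfile

from docx2pdf import convert

# delete=False keeps the file on disk after the handle is closed, so it can be
# reopened by path (Windows refuses to reopen a NamedTemporaryFile that is still open).
with tempfile.NamedTemporaryFile(suffix=".docx", delete=False) as temp_docx_file:
    temp_docx_file.write(file_stream.getvalue())
    temp_docx_path = temp_docx_file.name

Then, create another temporary file to hold the new .pdf file.

with tempfile.NamedTemporaryFile(suffix=".pdf", delete=False) as temp_pdf_file:
    temp_pdf_path = temp_pdf_file.name

Now, convert:

convert(temp_docx_path, temp_pdf_path)

To load the PDF back into a BytesIO object, read the file and then remove both temporary files:

with open(temp_pdf_path, "rb") as f:
    file_stream_pdf = io.BytesIO(f.read())

os.remove(temp_docx_path)
os.remove(temp_pdf_path)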
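
The temporary-file steps above can also be wrapped in a single helper built on tempfile.TemporaryDirectory, which sidesteps reopening a file that is still held open (a likely cause of the PermissionError on Windows mentioned in the comments) and cleans up automatically. This is only a minimal sketch: the name convert_stream and the file names input.docx and output.pdf are hypothetical, and the converter is assumed to be any callable that takes an input path and an output path, such as docx2pdf's convert. Note also that docx2pdf drives Microsoft Word, so it probably will not run inside a Linux-based cloud function; in that case another path-based converter would have to be passed in instead.

```python
import io
import tempfile
from pathlib import Path
from typing import Callable


def convert_stream(docx_stream: io.BytesIO,
                   converter: Callable[[str, str], None]) -> io.BytesIO:
    """Run a path-based converter on in-memory data via a temporary directory."""
    with tempfile.TemporaryDirectory() as tmp_dir:
        # Hypothetical file names; only the extensions matter to most converters.
        docx_path = Path(tmp_dir) / "input.docx"
        pdf_path = Path(tmp_dir) / "output.pdf"

        # getvalue() returns the whole buffer regardless of the stream position.
        docx_path.write_bytes(docx_stream.getvalue())

        # No file handles are open at this point, so the converter can freely
        # read the input path and create the output path.
        converter(str(docx_path), str(pdf_path))

        # Read the result before the directory (and its contents) is removed.
        return io.BytesIO(pdf_path.read_bytes())
```

With docx2pdf installed, the call would look like file_stream_pdf = convert_stream(file_stream, convert). Passing the converter in as an argument keeps the helper independent of any particular conversion library.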
JayPeerachai
  • When I run `with open(temp_docx_path, "wb") as f: f.write(file_stream.read())` I get the following error: TypeError: expected str, bytes or os.PathLike object, not _TemporaryFileWrapper , do you know how can I fix it? and thank you for your detailed answer @JayPeerachai – Amit Goft Dec 18 '22 at 09:22
  • @AmitGoft I have updated the answer, can you try it again? Please make sure that `temp_docx_path = temp_docx_file.name` before running `with open(temp_docx_path, "wb") as f: f.write(file_stream.read())`. – JayPeerachai Dec 18 '22 at 11:37
  • Thank you, now I am getting this error: PermissionError: [Errno 13] Permission denied: 'some-path\\tmpgdejwu42.docx' @JayPeerachai – Amit Goft Dec 18 '22 at 11:59
  • You need to check whether the permissions on that path allow both reading and writing. – JayPeerachai Dec 19 '22 at 16:04