I have a .doc file that contains text, and I need to replace a particular phrase in it with another one.

I tried python-docx, but it doesn't support the .doc format. I also tried reading the file as text and using a plain string replace, but that corrupts the .doc file:

with open("input.doc") as r:
   text = r.read().replace("old text", "new text")
with open("output.doc", "w") as w:
   w.write(text)

I can't change the file's extension, and I want to do this in Python.

macropod
Hassan Anwer
  • There are some possible pointers here: https://stackoverflow.com/questions/36001482/read-doc-file-with-python - However, you also want to save, so it's not a duplicate. – Hans Jun 09 '21 at 11:44
  • There is also https://stackoverflow.com/questions/4226824/is-it-possible-to-edit-doc-files-with-python, but the question is not correctly answered there, too. – Itai Klapholtz Jun 09 '21 at 11:45
  • @Hans Those are all about reading text from a .doc file, but I need to edit the file and save it in the same format without corrupting it. – Hassan Anwer Jun 09 '21 at 12:01

1 Answer

I think you must open the files in binary mode:

open("input.doc", "rb")

and

open("output.doc", "wb")

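You must also adapt the "replace" call to binary mode: in binary mode read() returns bytes, so both the search phrase and the replacement have to be bytes objects rather than str.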
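Something like the following might work as a starting point. It is only a sketch, not a tested solution for real Word files. The function names replace_phrase and replace_in_file are made up for illustration. It assumes the phrase is stored contiguously in the file, either as 8-bit cp1252 text or as UTF-16-LE, and it refuses replacements that change the byte length, on the assumption that a .doc file keeps internal offsets that a length change would invalidate.

```python
from pathlib import Path

# Assumption: Word may store the text either as 8-bit cp1252 or as UTF-16-LE.
CANDIDATE_ENCODINGS = ("cp1252", "utf-16-le")


def replace_phrase(data: bytes, old: str, new: str) -> bytes:
    """Replace `old` with `new` in raw bytes, trying each candidate encoding."""
    for encoding in CANDIDATE_ENCODINGS:
        old_bytes = old.encode(encoding)
        new_bytes = new.encode(encoding)
        # Keep the byte length unchanged so nothing after the phrase shifts.
        if len(old_bytes) != len(new_bytes):
            raise ValueError("old and new must encode to the same length")
        data = data.replace(old_bytes, new_bytes)
    return data


def replace_in_file(src: str, dst: str, old: str, new: str) -> None:
    """Read `src` as bytes, replace the phrase, and write the result to `dst`."""
    data = Path(src).read_bytes()        # same as open(src, "rb").read()
    Path(dst).write_bytes(replace_phrase(data, old, new))
```

It would be called as replace_in_file("input.doc", "output.doc", "old text", "new text"). If the replacement has to be longer or shorter than the original phrase, a byte-level replace is probably not enough, and the document would need to be edited through something that understands the format.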