I need to convert a .docx file with colored words into HTML. I've tried the mammoth library, but the colors are lost. How can I achieve this?
- Does this answer your question? [How do you convert a Word Document into very simple html in Python?](https://stackoverflow.com/questions/1596911/how-do-you-convert-a-word-document-into-very-simple-html-in-python) – Yevhen Kuzmovych Mar 12 '21 at 17:28
- Is it just one file? Have you tried uploading it to Google Docs and downloading it as HTML? – Yevhen Kuzmovych Mar 12 '21 at 17:29
- If you only need to convert one file, open the .docx document in Word and use `Save As` to save it as an HTML file. – Blackjack Mar 12 '21 at 17:32
- No, it's not just one file. The idea is to automate the process. – Javi Torre Mar 12 '21 at 17:34
- Give [`mammoth`](https://pypi.org/project/mammoth/) a try. – RJ Adriaansen Mar 12 '21 at 17:38
- I tried mammoth but lost the coloring. – Javi Torre Mar 12 '21 at 17:43
1 Answer
import os
import win32com.client

doc = win32com.client.GetObject(os.path.abspath("demo.docx"))
doc.SaveAs(FileName=os.path.abspath("hey.html"), FileFormat=8)  # 8 = wdFormatHTML
doc.Close()
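
Since the comments say the goal is to automate this for many files, the same `SaveAs` call can be wrapped in a loop. What follows is only a minimal sketch, assuming Windows with Word and pywin32 installed. `plan_conversions`, `convert_all`, and the folder arguments are hypothetical names introduced here for illustration; they are not part of pywin32 or Word.

```python
from pathlib import Path

WD_FORMAT_HTML = 8  # Word's wdFormatHTML constant, the same value used above


def plan_conversions(folder, out_folder):
    """Pair every .docx in `folder` with an absolute .html target path."""
    folder = Path(folder).resolve()
    out_folder = Path(out_folder).resolve()
    return [
        (src, out_folder / (src.stem + ".html"))
        for src in sorted(folder.glob("*.docx"))
        if not src.name.startswith("~$")  # skip Word's lock files
    ]


def convert_all(folder, out_folder):
    """Open each document in Word and save it as HTML (Windows + Word only)."""
    import win32com.client  # imported here so the planning helper works anywhere

    Path(out_folder).mkdir(parents=True, exist_ok=True)
    word = win32com.client.Dispatch("Word.Application")
    try:
        for src, dst in plan_conversions(folder, out_folder):
            doc = word.Documents.Open(str(src))
            doc.SaveAs(FileName=str(dst), FileFormat=WD_FORMAT_HTML)
            doc.Close()
    finally:
        word.Quit()
```

Absolute paths are used because Word resolves relative paths against its own working directory, not the script's. A call such as `convert_all("docs", "html_out")` (folder names assumed) would then produce one HTML file per document.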

FugitiveMemories
- Actually, I need to save the HTML code into a txt file. Would that be possible? – Javi Torre Mar 12 '21 at 18:00
- I don't understand. The OP says docx to HTML. If you need to save HTML code to a txt file, use [BeautifulSoup](https://www.crummy.com/software/BeautifulSoup/bs4/doc/). – FugitiveMemories Mar 13 '21 at 05:56
- I would first need to convert the docx file to an HTML variable and then write that variable to a txt file. Can I do those steps with Beautiful Soup? – Javi Torre Mar 13 '21 at 08:29
- Yes, indeed. It would have been easier for everyone if you had given some context to your problem. Use `soup = BeautifulSoup(open("path_to_hey.html"), "html.parser")`, and then read [here](https://stackoverflow.com/questions/14694482/converting-html-to-text-with-python). – FugitiveMemories Mar 13 '21 at 08:41
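
If the aim is only to get the HTML markup itself into a variable and then into a txt file, a parser may not be needed at all. This is a rough sketch using only the standard library. `html_to_txt` is a hypothetical helper, and the `encoding` default is an assumption: Word may save the page in a different encoding, in which case the value declared in the file's `<meta>` tag should be passed instead.

```python
from pathlib import Path


def html_to_txt(html_path, txt_path, encoding="utf-8"):
    """Read saved HTML into a string and write that markup to a text file."""
    # errors="replace" keeps the sketch from failing on a wrong encoding guess
    html = Path(html_path).read_text(encoding=encoding, errors="replace")
    Path(txt_path).write_text(html, encoding="utf-8")
    return html  # the "html variable", inline color styles included


# Example call, assuming hey.html was produced by the answer above:
# html = html_to_txt("hey.html", "hey.txt")
```

Because the markup is copied unchanged, any inline styling in the HTML stays in the txt file. Parsing with BeautifulSoup and extracting only the text would discard it.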