20
import pypandoc
output = pypandoc.convert_file('file.html', 'docx', outputfile="file1.docx")
assert output == ""

This generates a new docx file, but the styles from the HTML are ignored.

Can anyone tell me how to generate a new docx file that keeps the styles?

Thanks in advance for your answers.

Jayasri Tanneru
  • Check this http://stackoverflow.com/questions/1035183/how-can-i-create-a-word-document-using-python and http://python-docx.readthedocs.io/en/latest/user/styles-using.html – Farmer Mar 14 '17 at 06:40
  • Does this answer your question? [html to .doc converter in Python?](https://stackoverflow.com/questions/4226095/html-to-doc-converter-in-python) – Rene Jan 12 '22 at 14:59

2 Answers

14

On Windows, the easiest way is to drive MS Word itself through the pywin32 package. Here is a good answer with example code.
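As a rough sketch of the Word automation route, assuming pywin32 and a local MS Word install are available: the function below opens the HTML file in Word and saves it as docx, so Word itself decides how the styles are carried over. The helper names `absolute_paths` and `html_to_docx_with_word` are made up for this illustration; the value 12 is Word's `wdFormatXMLDocument` save format.

```python
import os

# WdSaveFormat value for the .docx (Open XML) format.
WD_FORMAT_XML_DOCUMENT = 12


def absolute_paths(html_path, docx_path):
    """Word resolves relative paths against its own working directory,
    so hand it absolute paths instead."""
    return os.path.abspath(html_path), os.path.abspath(docx_path)


def html_to_docx_with_word(html_path, docx_path):
    """Hypothetical helper: convert HTML to docx by automating MS Word."""
    # Imported here so the module still loads on machines without pywin32.
    import win32com.client

    src, dst = absolute_paths(html_path, docx_path)
    word = win32com.client.Dispatch("Word.Application")
    word.Visible = False
    try:
        doc = word.Documents.Open(src)
        doc.SaveAs2(dst, FileFormat=WD_FORMAT_XML_DOCUMENT)
        doc.Close(False)  # close without prompting to save again
    finally:
        word.Quit()  # always release the Word process
```

Calling `html_to_docx_with_word('file.html', 'file1.docx')` should leave `file1.docx` next to the script. This only works on Windows with Word installed.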

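Using pypandoc (convert_file replaces the older pypandoc.convert, which newer releases no longer provide):

import pypandoc
output = pypandoc.convert_file('/path/to/file.html', 'docx', format='html', outputfile='/path/to/output.docx', extra_args=['-RTS'])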

See the pandoc documentation for the options you can pass in extra_args.
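On the styles part of the question: pandoc's docx writer maps content onto Word styles rather than copying CSS, and it can take those styles from a reference document passed with `--reference-doc` (pandoc 2.0 or later). A minimal sketch, assuming you have prepared a styled reference file yourself; the file name `styles.docx` and both helper names are placeholders, not anything from the question:

```python
def reference_doc_args(reference_docx=None):
    """Build the extra_args list; empty when no reference document is given."""
    if reference_docx is None:
        return []
    return [f"--reference-doc={reference_docx}"]


def html_to_styled_docx(html_path, docx_path, reference_docx=None):
    """Hypothetical helper: convert HTML to docx, optionally with a reference docx."""
    # Imported here so the argument helper can be used without pypandoc installed.
    import pypandoc

    return pypandoc.convert_file(
        html_path,
        "docx",
        format="html",
        outputfile=docx_path,
        extra_args=reference_doc_args(reference_docx),
    )
```

For example, `html_to_styled_docx('file.html', 'file1.docx', 'styles.docx')` would apply the heading and paragraph styles defined in `styles.docx`. Inline CSS in the HTML will most likely still be dropped, so this helps only where the look can be expressed through Word styles.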

Emin Mastizada
11

You can also use htmldocx in Python 3.x:

from htmldocx import HtmlToDocx

new_parser = HtmlToDocx()
new_parser.parse_html_file("html_filename", "docx_filename")
# File extensions are not needed, but tolerated
Rene
Synthase