I would like to open an HTML page with MS Word (Microsoft Office Standard 2019). The page contains MathML code (e.g. this page). If I save it and open it with MS Word, the formulas appear as plain text; the mathematical formatting is completely lost.

In this post I found a trick for importing a MathML formula into MS Word: paste the code into WordPad, then copy it from WordPad and paste it into MS Word. It is more of an easter egg than a real solution; indeed, when I try it with a full page, it does not work.

This is very annoying, since MS Word is clearly able to interpret MathML correctly, but it does not do so when it matters. I guess it is just a commercial decision to promote OOXML markup over MathML.

Is there a way to make MS Word handle MathML properly? I would like to right-click a .html file, select "Open with Word", and see the formulas rendered correctly.

  • I don't think there is any option to do that. Faced with that problem, I'd probably consider installing pandoc and perhaps writing a little bit of VBA to run it via a shell and import the resulting .docx. – jonsson Feb 17 '23 at 06:12
  • Well, the conclusion is always the same: MS Word is not a professional program and the best solution is to avoid it altogether. Besides this, I also have problems with pandoc, e.g. italic or bold text comes out in a normal font in Word. If you have experience with this, do you think that is usual? – Doriano Brogioli Feb 17 '23 at 08:07
  • I don't - I'd only use it as a starting point because it seems fairly well regarded, is rumoured to solve the Math problem, and is free. As so often with software, it will probably end up being a question of "which problem are you going to have to deal with manually?" – jonsson Feb 17 '23 at 09:14
  • This is another wise suggestion! As a LaTeX user, I am not used to "manually dealing with problems". But, yes, having to deal with .docx documents, I should be aware that I will have to correct some of the details manually. – Doriano Brogioli Feb 17 '23 at 09:20
  • Rather an obvious suggestion but if pandoc has problems with simple formatting conversions, trying to improve pandoc might be a good strategy. The probability that Microsoft will change its HTML import/conversion software for any other reason than "fixing a security problem" seems to me to be 0, i.e. "highly unlikely". – jonsson Feb 19 '23 at 21:15
  • Maybe I could open a new question about pandoc. My problem is that it ignores most of the formatting, e.g. the […] tag. As far as I understand, this is a feature: pandoc only considers the "content" tags such as […] but ignores the "styling" tags such as […]. Do you have experience with this? – Doriano Brogioli Feb 20 '23 at 10:06
  • No, I don't. Let's assume you've had a good look around at converters but there isn't anything that comes close enough to what you need. What then? Tricky: the kind of options that come to mind either involve coding effort to do whatever bits Word doesn't do, or something bizarre like "print the HTML to PDF, then get Word to OCR the output". I can't really see that working, but I would probably try it once just to find out. – jonsson Feb 20 '23 at 11:33
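
The pandoc route suggested in the comments could be sketched roughly as follows. The comments mention VBA; this sketch uses Python instead, purely for illustration. It assumes pandoc is installed and on the PATH, and the function names `build_pandoc_command` and `convert_html_to_docx` are hypothetical helpers, not part of pandoc. Whether the MathML and the bold/italic formatting survive the conversion has to be checked on the actual page.

```python
import shutil
import subprocess
from pathlib import Path


def build_pandoc_command(html_path, docx_path):
    """Return the pandoc argument list for an HTML -> DOCX conversion."""
    return [
        "pandoc",
        str(html_path),
        "-f", "html",          # input format
        "-t", "docx",          # output format
        "-o", str(docx_path),  # output file
    ]


def convert_html_to_docx(html_path):
    """Write a .docx next to html_path and return its path.

    Returns None when pandoc is not found on the PATH (assumption:
    pandoc is installed separately; it is not bundled with Word).
    """
    html_path = Path(html_path)
    docx_path = html_path.with_suffix(".docx")
    if shutil.which("pandoc") is None:
        return None
    # check=True raises CalledProcessError if pandoc reports a failure.
    subprocess.run(build_pandoc_command(html_path, docx_path), check=True)
    return docx_path
```

Calling `convert_html_to_docx("page.html")` (with a placeholder file name) would produce `page.docx`, which can then be opened in Word and checked by hand.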

0 Answers