
I found the Mammoth Python package a couple of days ago, and it's a great tool that creates clean HTML from a Word document. It's nearly perfect. There is just one artifact I don’t understand: the heading elements (h1-h6) it creates from the Word headings contain several empty <a> elements with strange TOC ids. It looks like this:

<h1><a id="_Toc48228035"></a><a id="_Toc48288791"></a><a id="_Toc48303673"></a><a id="_Toc48306159"></a><a id="_Toc48308644"></a><a id="_Toc48311128"></a><a id="_Toc48313611"></a>Arteriosklerose</h1>

Does anybody know how to get rid of these?

Thanks in advance

Cheers, Peter

Peter Ebel

1 Answer


This is just a guess, but I hope it helps:

TOC most probably stands for "Table of Contents". When you want to jump to an element on the page (such as a certain chapter), you give that element an ID and append #ID to the URL. The browser then scrolls directly to that point.

I guess you are using a table of contents somewhere, with links in it, and when you inspect those links you will find something like <a href="#_Toc48228035">Arteriosklerose</a>
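If the anchors really are just link targets for a table of contents and you don't need them, one way to drop them is to post-process the HTML string after conversion. This is only a minimal sketch, not something built into Mammoth: the function name `strip_toc_anchors` is made up for illustration, and it assumes every unwanted anchor is empty and has exactly the form shown in the question (an id of `_Toc` followed by digits).

```python
import re

# Matches an empty anchor such as <a id="_Toc48228035"></a>.
# Assumption: the bookmark ids always look like "_Toc" followed by digits,
# as in the heading shown in the question.
TOC_ANCHOR = re.compile(r'<a id="_Toc\d+"></a>')


def strip_toc_anchors(html: str) -> str:
    """Return html with the empty _Toc bookmark anchors removed."""
    return TOC_ANCHOR.sub("", html)


sample = (
    '<h1><a id="_Toc48228035"></a><a id="_Toc48288791"></a>'
    "Arteriosklerose</h1>"
)
print(strip_toc_anchors(sample))  # <h1>Arteriosklerose</h1>
```

With Mammoth you would pass in the `value` of the result returned by `mammoth.convert_to_html(...)`. Keep in mind that any link of the form `<a href="#_Toc48228035">` elsewhere in the output will stop jumping to the heading once its target anchor is gone.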

cagcoach