I have a C# module that extracts information from an HTML file, but my input is an MHT file. How do I extract just the HTML portion of the MHT file?
-
MHTML files are _MIME HTML_ files. You need a MIME parser/decoder. [Related question](http://stackoverflow.com/questions/3876406/basic-c-sharp-mime-decoding) – M.Babcock Mar 06 '12 at 20:38
-
Thanks for pointing me in the right direction! – Dan Bailiff Mar 14 '12 at 21:36
1 Answer
I tried several tools and libraries that claimed to extract the contents of an MHT file, but almost all of them failed; it turned out that the provider of my MHT files did not encode some types correctly. I eventually discovered Total Commander, which let me unpack the MHT and extract just the HTML portion. It was a hack, but it got the job done.
It would seem that there are many tools for creating MHTs and few for unpacking them.
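
For anyone who wants to stay inside C#, here is a minimal sketch of the MIME-decoding route suggested in the comments. It is not what I ended up using, and it rests on several assumptions: the class and method names (`MhtHtmlExtractor`, `ExtractHtml`) are made up for illustration, the MHT is a single-level multipart message already loaded into a string, and the HTML part uses base64, quoted-printable, or no transfer encoding. It deliberately passes malformed quoted-printable sequences through unchanged instead of throwing, since badly encoded files were what tripped up the tools I tried.

```csharp
using System;
using System.Collections.Generic;
using System.Text;
using System.Text.RegularExpressions;

// Hypothetical helper, not part of any library: a tolerant, minimal MHT reader.
public static class MhtHtmlExtractor
{
    // Returns the decoded body of the first text/html part, or null if none is found.
    public static string ExtractHtml(string mht)
    {
        string text = mht.Replace("\r\n", "\n");
        var (headers, body) = SplitHeaders(text);
        string contentType = Get(headers, "content-type");

        var m = Regex.Match(contentType, "boundary=\"?([^\";\\s]+)", RegexOptions.IgnoreCase);
        if (!m.Success) // not multipart: the whole message may be the HTML itself
            return contentType.ToLowerInvariant().Contains("text/html") ? Decode(headers, body) : null;

        // Assumption: no nested multiparts, so a plain split on the delimiter is enough.
        string delimiter = "--" + m.Groups[1].Value;
        string[] chunks = text.Split(new[] { "\n" + delimiter }, StringSplitOptions.None);
        for (int i = 1; i < chunks.Length; i++)
        {
            string chunk = chunks[i];
            if (chunk.StartsWith("--", StringComparison.Ordinal)) break; // closing delimiter
            int lineEnd = chunk.IndexOf('\n');
            if (lineEnd < 0) continue;
            var (partHeaders, partBody) = SplitHeaders(chunk.Substring(lineEnd + 1));
            if (Get(partHeaders, "content-type").ToLowerInvariant().Contains("text/html"))
                return Decode(partHeaders, partBody);
        }
        return null;
    }

    // Splits a message or part at the first blank line and parses the header block.
    static (Dictionary<string, string>, string) SplitHeaders(string part)
    {
        var headers = new Dictionary<string, string>(StringComparer.OrdinalIgnoreCase);
        int end = part.IndexOf("\n\n", StringComparison.Ordinal);
        string head = end < 0 ? part : part.Substring(0, end);
        string body = end < 0 ? "" : part.Substring(end + 2);
        head = Regex.Replace(head, "\n[ \t]+", " "); // unfold continuation lines
        foreach (string line in head.Split('\n'))
        {
            int colon = line.IndexOf(':');
            if (colon > 0)
                headers[line.Substring(0, colon).Trim()] = line.Substring(colon + 1).Trim();
        }
        return (headers, body);
    }

    static string Get(Dictionary<string, string> headers, string name)
    {
        return headers.TryGetValue(name, out var value) ? value : "";
    }

    // Undoes the Content-Transfer-Encoding and applies the declared charset (UTF-8 fallback).
    static string Decode(Dictionary<string, string> headers, string body)
    {
        string transfer = Get(headers, "content-transfer-encoding").ToLowerInvariant();
        Encoding charset = Encoding.UTF8;
        var cs = Regex.Match(Get(headers, "content-type"), "charset=\"?([^\";\\s]+)", RegexOptions.IgnoreCase);
        if (cs.Success)
        {
            try { charset = Encoding.GetEncoding(cs.Groups[1].Value); }
            catch (ArgumentException) { /* unknown charset name: keep UTF-8 */ }
        }
        if (transfer == "base64")
            return charset.GetString(Convert.FromBase64String(Regex.Replace(body, "\\s+", "")));
        if (transfer == "quoted-printable")
            return charset.GetString(DecodeQuotedPrintable(body));
        return body;
    }

    static byte[] DecodeQuotedPrintable(string s)
    {
        var bytes = new List<byte>();
        for (int i = 0; i < s.Length; i++)
        {
            if (s[i] == '=' && i + 1 < s.Length && s[i + 1] == '\n') { i++; continue; } // soft line break
            if (s[i] == '=' && i + 2 < s.Length && Uri.IsHexDigit(s[i + 1]) && Uri.IsHexDigit(s[i + 2]))
            {
                bytes.Add(Convert.ToByte(s.Substring(i + 1, 2), 16));
                i += 2;
            }
            else bytes.Add((byte)s[i]); // malformed escapes pass through literally
        }
        return bytes.ToArray();
    }
}
```

A call would look like `string html = MhtHtmlExtractor.ExtractHtml(File.ReadAllText(path));`, after which the result can be handed to the existing HTML module. For well-formed files a real MIME library is probably the safer choice; this sketch only covers the common "one HTML part plus some resources" layout.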

Dan Bailiff