I am trying to convert HTML emails into readable plain text so I can store their content.
HtmlAgilityPack seems well regarded, but it leaves most of the parsing and interpreting to me, and the HTML in question is rather messy.
On the other hand, if I load a sample HTML email in IE, Firefox, or Chrome, they all parse it correctly, and a simple copy/paste gets me the text I want.
There seem to be ways to tap into Trident from C# using a System.Windows.Forms.WebBrowser control, but since my project is command-line based, this would be a rather hackish way of doing things.
So my question, in a nutshell: is there a non-graphical way to use Trident/Gecko/Chrome to parse HTML into text?
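
For context, here is a minimal sketch of the manual HtmlAgilityPack route, to show what "most of the parsing left to do" means in practice. The `HtmlToText` class and `Convert` method names are made up for illustration. It only drops script/style/comment nodes, decodes entities, and collapses whitespace, so it does not reproduce what a browser's copy/paste gives (line breaks for block elements, table layout, hidden content, and so on):

```csharp
using System.Linq;
using System.Text.RegularExpressions;
using HtmlAgilityPack;

// Hypothetical helper, not part of HtmlAgilityPack itself.
static class HtmlToText
{
    public static string Convert(string html)
    {
        var doc = new HtmlDocument();
        doc.LoadHtml(html);

        // SelectNodes returns null (not an empty collection) when nothing matches.
        var unwanted = doc.DocumentNode.SelectNodes("//script|//style|//comment()");
        if (unwanted != null)
        {
            // Copy to a list first so removal does not disturb the enumeration.
            foreach (var node in unwanted.ToList())
                node.Remove();
        }

        // InnerText concatenates all remaining text nodes; entities such as
        // &amp; or &nbsp; still need decoding.
        string text = HtmlEntity.DeEntitize(doc.DocumentNode.InnerText);

        // Collapse runs of whitespace; all block-level structure is lost here.
        return Regex.Replace(text, @"\s+", " ").Trim();
    }
}
```

Calling `HtmlToText.Convert(emailBody)` gives one long line of text, which is exactly the limitation: anything resembling browser-quality layout has to be hand-written on top of this.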