Possible Duplicate:
What is the best way to parse html in C#?
I would like to extract the structure of an HTML document, so the tags matter more than the content. Ideally, the parser would also cope reasonably well with badly formed HTML.
Does anyone know of a reliable and efficient parser?