Is there a .NET library that can clean up an HTML document and remove unclosed tags?
2 Answers
3
Html Agility Pack

Luke Schafer
-
Sorry to bother you again. I tried Html Agility Pack without success: I created a new HtmlDocument, passing the string containing the HTML I want to fix to the constructor. However, I need to return the document as a string, and I don't know how to do that. – ryudice Dec 02 '09 at 03:03
-
I parsed my text using the HtmlDocument class, but it still leaves the unclosed tags in place. Is there a way to remove them? – ryudice Dec 02 '09 at 03:13
-
Off the top of my head I can't remember, but try the output-as-XML option (OptionOutputAsXml). There's also another option on the document for fixing nested tags, but I'm not sure under what circumstances it works. – Luke Schafer Dec 02 '09 at 04:34
-
Luke, I believe you're referring to the answer I just gave to my own question: http://stackoverflow.com/questions/2175071/how-would-i-get-the-inputs-from-a-certain-form-with-htmlagility-pack-lang-c-ne – codygman Feb 04 '10 at 15:05
-
I wasn't; I've used it before. But that's a great post, and thanks for sharing. – Luke Schafer Feb 04 '10 at 23:26
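To make the comments above concrete, here is a minimal sketch of loading a fragment with Html Agility Pack and getting the result back as a string. It assumes the HtmlAgilityPack NuGet package is referenced. As far as I know, HtmlDocument has only a parameterless constructor, so the markup goes in through LoadHtml, and the "outputasxml" suggestion corresponds to the OptionOutputAsXml property, with OptionFixNestedTags being the nested-tag option. The HtmlRepair class, the Repair method, and the sample input are hypothetical names for illustration. How well this closes tags depends on the input, so treat it as a starting point rather than a guaranteed fix.

```csharp
using System;
using System.IO;
using HtmlAgilityPack; // third-party NuGet package: HtmlAgilityPack

static class HtmlRepair
{
    // Hypothetical helper: parses the markup and returns it re-serialized.
    public static string Repair(string html)
    {
        var doc = new HtmlDocument();

        // Set options before loading, since some affect parsing.
        doc.OptionFixNestedTags = true; // try to repair badly nested tags
        doc.OptionOutputAsXml = true;   // serialize as well-formed XML

        doc.LoadHtml(html);

        // Save to a StringWriter to get the document back as a string.
        using (var writer = new StringWriter())
        {
            doc.Save(writer);
            return writer.ToString();
        }
    }

    static void Main()
    {
        // Sample input with unclosed <p> and <b> tags (illustrative only).
        Console.WriteLine(Repair("<div><p>Hello <b>world</div>"));
    }
}
```

If XML output is not wanted, doc.DocumentNode.OuterHtml also returns the parsed document as a string, without the XML-specific serialization.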
2
HtmlTidy!
See the URL below for more details:
http://www.devx.com/dotnet/Article/20505/0/page/2
The source of the download/project is:
I gave the other link because it contains information about a .NET wrapper and how to set everything up. Hope this helps!

codygman
-
For C#, the specific project is TidyManaged, maintained by Mark Beaton: https://github.com/markbeaton/TidyManaged – wonea Jan 10 '11 at 17:33
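For reference, a rough sketch of what using TidyManaged might look like, based on my recollection of the project's README. The member names (Document.FromString, ShowWarnings, Quiet, OutputXhtml, CleanAndRepair, Save) are assumptions that should be checked against the version you install. The wrapper also needs the native HTML Tidy library available at run time. The TidyExample class and the sample input are made up for illustration.

```csharp
using System;
using TidyManaged; // third-party wrapper around the native HTML Tidy library

static class TidyExample
{
    static void Main()
    {
        // Illustrative input with an unclosed <p> and <b>.
        string html = "<html><title>test</title><body><p>Hello <b>world</body>";

        // Member names below follow the README as I remember it; verify them.
        using (Document doc = Document.FromString(html))
        {
            doc.ShowWarnings = false; // suppress Tidy's warning output
            doc.Quiet = true;         // suppress informational messages
            doc.OutputXhtml = true;   // emit well-formed XHTML
            doc.CleanAndRepair();     // run Tidy's repair pass

            string repaired = doc.Save(); // serialize back to a string
            Console.WriteLine(repaired);
        }
    }
}
```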