3

I have some HTML stored in a database, and I don't know whether it contains extra closing div tags such as </div>. I want to find any extra closing div tags in the HTML string. I tried using HTML Agility Pack but could not find a way to do this. Example:

<div class="readers">
    A total of 218 users are reading this article.
</div>
</div>
</div>

How can I find these two extra closing div tags and extract fully valid HTML?

Suneel Gupta
  • Have you tried the [W3C HTML Validator](http://validator.w3.org/)? – Teneff Jun 21 '12 at 07:59
  • An off-the-shelf HTML parser should be able to highlight problems such as extra ending tags. Are you looking specifically for extra ending div tags, or for _any_ syntax problems? – Ray Toal Jun 21 '12 at 07:59
  • Do you just want to find the extra </div>s? Or do you just want to fix the HTML? – rikitikitik Jun 21 '12 at 10:25

2 Answers

0

Use this pure JavaScript parser before rendering the HTML: http://ejohn.org/blog/pure-javascript-html-parser/

You can try it by pasting your code at http://ejohn.org/apps/htmlparser/ and it removes the extra </div>s.

You just need to pass your HTML to the HTMLtoXML function:

HTMLtoXML(your_html);

and it will remove the extra closing tags. In fact, it converts the input into XML format, but since you are dealing with HTML strings and all the tags are expected to be valid HTML, this should be safe to use.
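If you also need to know where the extra tags are, the idea can be sketched with a simple depth counter. This is only an illustration, not the linked parser's code: `stripExtraClosingDivs` and the `extras` / `unclosed` fields are hypothetical names, and the sketch assumes the text "<div" never appears inside comments, scripts, or attribute values.

```javascript
// Hypothetical helper, not part of the linked parser.
// Assumption: only <div> tags matter, and none occur inside comments,
// <script> blocks, or attribute values.
function stripExtraClosingDivs(html) {
  const tagPattern = /<\/?div\b[^>]*>/gi; // opening or closing div tags
  let depth = 0;       // number of <div> elements currently open
  let lastIndex = 0;   // end of the last chunk copied to the output
  let output = "";
  const extras = [];   // character offsets of unmatched </div> tags
  let match;

  while ((match = tagPattern.exec(html)) !== null) {
    const isClosing = match[0][1] === "/";
    if (!isClosing) {
      depth++;
    } else if (depth > 0) {
      depth--;
    } else {
      // A closing tag with nothing open: record its offset and skip it.
      extras.push(match.index);
      output += html.slice(lastIndex, match.index);
      lastIndex = match.index + match[0].length;
    }
  }
  output += html.slice(lastIndex);

  // unclosed > 0 means some <div> was opened but never closed.
  return { html: output, extras: extras, unclosed: depth };
}

const result = stripExtraClosingDivs("<div>a</div></div></div>");
console.log(result.html);          // <div>a</div>
console.log(result.extras.length); // 2
```

The same counter logic should port to C# with `System.Text.RegularExpressions.Regex`, though a real HTML parser is the more robust choice for messy input.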

EDIT: You can call JavaScript functions from a C# file. See this question for more details.

gopi1410
  • Hi Gopi, I like this, but I am fetching the HTML in a .cs file, so I want an HTML parser that can work in a .cs file. – Suneel Gupta Jun 21 '12 at 08:37
  • @user1221708 there are many methods to call a javascript function from a C# file. See this for details: http://stackoverflow.com/questions/6775606/how-to-call-javascript-jquery-function-from-cs-file-c – gopi1410 Jun 21 '12 at 08:43
  • Hi Gopi, but I have to extract the valid HTML in the .cs file and then do some other operations on it. Please help if you have another idea for extracting valid HTML. Thanks for the last response. – Suneel Gupta Jun 21 '12 at 11:10
  • I don't have much idea about that, but try googling for "asp.net html parser"; it gives some useful results. – gopi1410 Jun 21 '12 at 11:17
-1

Click here to find both unclosed (hanging) as well as extra div tags: tormus

Ramaanand