
I have a long C# string of HTML and I want to extract just the bullet list from it, i.e. the "<ul><li></li></ul>" part.

Say I have the following HTML string.

var html = "<div class=ClassC441AA82DA8C5C23878D8>Here is a text that should be ignored.</div>This text should be ignored too<br><ul><li>*&nbsp;&nbsp;Need this one</li><li>Another bullet point I need</li><li>A bulletpoint again that I want</li><li>And this is the last bullet I want</li></ul><div>Ignore this line and text</div><p>Ignore this as well.</p>Text not important.";

I need everything between the '<ul>' and '</ul>' tags. The '<ul>' tags themselves can be excluded.

Regular expressions are not my strong suit, but if one can be used here I would appreciate some help. My code is in C#.

Colin Mackay
codingjoe
  • Did you try anything? Take a look at Html Agility Pack. – CodeCaster Jun 28 '13 at 07:58
  • Use a parser, such as the [HTML Agility Pack](http://htmlagilitypack.codeplex.com/), not a regex. – Damien_The_Unbeliever Jun 28 '13 at 07:58
  • You should really, really read this answer :D http://stackoverflow.com/questions/1732348/regex-match-open-tags-except-xhtml-self-contained-tags/1732454#1732454 – Lorenzo Dematté Jun 28 '13 at 08:25
  • And (serious again) this post http://www.codinghorror.com/blog/2009/11/parsing-html-the-cthulhu-way.html and this answer (explaining why) http://stackoverflow.com/a/590789/863564 – Lorenzo Dematté Jun 28 '13 at 08:25
  • @LorenzoDematté I think I get the message :) – codingjoe Jun 28 '13 at 08:40
  • The second comment is the relevant one.. but the first answer makes me crack every time :) Sorry, I couldn't resist linking to it, it's a bit of SO history – Lorenzo Dematté Jun 28 '13 at 08:43
  • I would not say that it is a duplicate question, since I wasn't looking for a specific library to help my situation but an idea of using regexp for this, which I can see is not a good approach. – codingjoe Jun 28 '13 at 08:43

1 Answer


You should use the HtmlAgilityPack for things like this. I wrote a little introduction to it a while ago that may help you get going: http://colinmackay.scot/2011/03/22/a-quick-intro-to-the-html-agility-pack/
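
For illustration, here is a minimal sketch of how the list from the question might be pulled out with the Html Agility Pack. It is not taken from the linked introduction. It assumes the `HtmlAgilityPack` NuGet package is referenced, and the class and method names (`BulletExtractor`, `GetListInnerHtml`, `GetBulletTexts`) are hypothetical, chosen only for this example.

```csharp
using System;
using System.Collections.Generic;
using HtmlAgilityPack; // third-party NuGet package "HtmlAgilityPack" (assumed to be installed)

// Hypothetical helper class; the name is an assumption made for this example.
public static class BulletExtractor
{
    // Returns the markup between the first <ul> and its </ul>,
    // or null if the input contains no <ul> element.
    public static string GetListInnerHtml(string html)
    {
        var doc = new HtmlDocument();
        doc.LoadHtml(html);
        HtmlNode ul = doc.DocumentNode.SelectSingleNode("//ul");
        return ul?.InnerHtml;
    }

    // Returns the text of every <li> that sits directly under a <ul>,
    // in document order, with HTML entities decoded.
    public static List<string> GetBulletTexts(string html)
    {
        var doc = new HtmlDocument();
        doc.LoadHtml(html);
        var result = new List<string>();

        // SelectNodes returns null (not an empty collection) when nothing matches.
        HtmlNodeCollection items = doc.DocumentNode.SelectNodes("//ul/li");
        if (items == null)
        {
            return result;
        }

        foreach (HtmlNode li in items)
        {
            // DeEntitize turns entities such as &nbsp; into the characters they stand for.
            result.Add(HtmlEntity.DeEntitize(li.InnerText).Trim());
        }
        return result;
    }
}
```

Calling `BulletExtractor.GetBulletTexts(html)` with the string from the question should return the four bullet texts, while `GetListInnerHtml(html)` should return the `<li>…</li>` markup without the surrounding `<ul>` tags. The first bullet keeps its two non-breaking spaces after decoding, so replace them separately if plain spaces are needed.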

Colin Mackay