Here is a sample HTML snippet:

<div><span>span1</span></div>
<b>for test</b>
<span>span2</span>

Is there any way to get all span tags that are not inside a div tag (in this sample, span2)?

Following the post "C# Regular Expression excluding a string", I wrote the pattern below, but it does not work.

Pattern: ((?:(?!\b<div>\b))*)((.|\n)*?)<span>((.|\n)*?)</span>((.|\n)*?)((?:(?!\b</div>\b))*)

Ghooti Farangi

1 Answer

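You really don't want to use regular expressions to parse HTML. HTML is not a regular language, so a pattern cannot reliably track nesting, such as whether a given span sits inside a div. The Stack Overflow question "RegEx match open tags except XHTML self-contained tags" covers the many reasons in detail.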

You should use an HTML parser such as Html Agility Pack instead, or even a simple XML parser such as XmlReader if the markup is well-formed XML.
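
To illustrate the parser-based approach, here is a minimal sketch in Python using only the standard-library html.parser module. It is not C# and not Html Agility Pack; the class name SpansOutsideDiv and the helper spans_outside_div are made up for this example, and it assumes span tags are not nested inside each other. The idea is to count how many div elements are currently open and only collect a span when that count is zero.

```python
from html.parser import HTMLParser


class SpansOutsideDiv(HTMLParser):
    """Collect the text of every <span> that has no open <div> ancestor.

    Hypothetical helper for illustration; assumes spans are not nested.
    """

    def __init__(self):
        super().__init__()
        self.div_depth = 0    # number of <div> elements currently open
        self.current = None   # text pieces of the span being captured
        self.results = []     # text of each matching span, in order

    def handle_starttag(self, tag, attrs):
        if tag == "div":
            self.div_depth += 1
        elif tag == "span" and self.div_depth == 0 and self.current is None:
            self.current = []  # start capturing this span's text

    def handle_endtag(self, tag):
        if tag == "div" and self.div_depth > 0:
            self.div_depth -= 1
        elif tag == "span" and self.current is not None:
            self.results.append("".join(self.current))
            self.current = None

    def handle_data(self, data):
        if self.current is not None:
            self.current.append(data)


def spans_outside_div(html):
    """Return the text of all spans that are not inside any div."""
    parser = SpansOutsideDiv()
    parser.feed(html)
    parser.close()
    return parser.results


sample = "<div><span>span1</span></div>\n<b>for test</b>\n<span>span2</span>"
print(spans_outside_div(sample))
```

With Html Agility Pack, the same selection can likely be written as an XPath query such as //span[not(ancestor::div)] passed to SelectNodes on the document node, which avoids the manual depth counting.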

pauljz