0

Say I have the following string.

"<description>This is the description,<strong> I want to retrieve this text</strong></description> and this is not the description."

I just want to extract the part of the string between the two description tags. I know I can install and use something like HTML Agility Pack, but I'd rather not for a single-purpose task such as this. Also, the .NET XML parser won't do, because it does not play well with HTML.

LaserBeak
  • Maybe it's fine for the exact question you posted, or a limited subset. Otherwise: [Even Jon Skeet cannot parse HTML using regular expressions](http://stackoverflow.com/q/1732348/304683) – EdSF May 26 '12 at 16:33

3 Answers

2
var description = Regex.Match(s, @"<description>(.*)</description>").Groups[1].Value;
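One thing to watch with the pattern above: `.*` is greedy, so if the input ever contains more than one `<description>` element it runs to the last closing tag. Here is a minimal sketch of the difference between `.*` and the lazy `.*?`. The class name `GreedyVsLazy`, the helper `Extract`, and the sample input are made up for illustration.

```csharp
using System;
using System.Text.RegularExpressions;

public static class GreedyVsLazy
{
    // Hypothetical helper: returns capture group 1, or "" when nothing matches.
    public static string Extract(string input, string pattern)
    {
        return Regex.Match(input, pattern).Groups[1].Value;
    }

    public static void Main()
    {
        // Made-up input with two description elements.
        string s = "<description>first</description> gap <description>second</description>";

        // Greedy: captures everything up to the LAST closing tag.
        Console.WriteLine(Extract(s, @"<description>(.*)</description>"));
        // prints: first</description> gap <description>second

        // Lazy: stops at the FIRST closing tag.
        Console.WriteLine(Extract(s, @"<description>(.*?)</description>"));
        // prints: first
    }
}
```

For the single-element string in the question both forms give the same result, so this only matters if the input can grow.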
Tim S.
1

You can use regex with lookaround to match the opening and closing tags:

string description = 
    Regex.Match(html, @"(?<=<description>).*?(?=</description>)").Value;

However, be aware that this approach is very brittle. For example, it assumes that your <description> elements will never have attributes, be nested, or be self-closing.
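To make the brittleness concrete, here is a rough sketch showing the lookaround pattern failing as soon as the opening tag carries an attribute, next to a slightly more tolerant variant. The `Tolerant` pattern, the class name `BrittleDemo`, and the sample inputs are assumptions for illustration; the variant still does not cope with nested or self-closing elements.

```csharp
using System;
using System.Text.RegularExpressions;

public static class BrittleDemo
{
    // The lookaround pattern from this answer.
    public static string Strict(string html) =>
        Regex.Match(html, @"(?<=<description>).*?(?=</description>)").Value;

    // Hypothetical variant: allows attributes on the opening tag,
    // mixed-case tag names, and line breaks inside the element.
    public static string Tolerant(string html) =>
        Regex.Match(html, @"<description\b[^>]*>(.*?)</description>",
                    RegexOptions.IgnoreCase | RegexOptions.Singleline)
             .Groups[1].Value;

    public static void Main()
    {
        string plain = "<description>text</description>";
        string withAttr = "<description lang=\"en\">text</description>";

        Console.WriteLine(Strict(plain));             // prints: text
        Console.WriteLine(Strict(withAttr) == "");    // prints: True (no match)
        Console.WriteLine(Tolerant(withAttr));        // prints: text
    }
}
```

If the markup is not under your control, a real HTML parser remains the safer choice.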

Douglas
0

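You can use a regex to get the string between the description tags with the following code. The capture group holds the text between the tags.

 Regex objPatterntable = new Regex("<description [^>]*?>(.*?)</description>", RegexOptions.IgnoreCase | RegexOptions.Singleline | RegexOptions.IgnorePatternWhitespace);
 string description = objPatterntable.Match(html).Groups[1].Value; // html is the input string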
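The options matter here more than they might appear to. With `RegexOptions.IgnorePatternWhitespace` the space after `<description` in the pattern is ignored, so a bare `<description>` tag still matches; without that option the space is literal and the tag must have at least one attribute. `RegexOptions.Singleline` lets `.` match line breaks. A small sketch of both effects follows; the pattern mirrors the one above, and the class name `OptionsDemo` and the sample inputs are made up for illustration.

```csharp
using System;
using System.Text.RegularExpressions;

public static class OptionsDemo
{
    // Same shape as the pattern above, including the space after "description".
    const string Pattern = "<description [^>]*?>.*?</description>";

    // Hypothetical helper: does the pattern match under the given options?
    public static bool Matches(string input, RegexOptions options) =>
        new Regex(Pattern, options).IsMatch(input);

    public static void Main()
    {
        string bare = "<description>abc</description>";
        string multiline = "<description id=\"d\">line one\nline two</description>";

        // Without IgnorePatternWhitespace the space in the pattern is literal.
        Console.WriteLine(Matches(bare, RegexOptions.None));                    // prints: False
        Console.WriteLine(Matches(bare, RegexOptions.IgnorePatternWhitespace)); // prints: True

        // Without Singleline, '.' does not match '\n'.
        Console.WriteLine(Matches(multiline, RegexOptions.None));               // prints: False
        Console.WriteLine(Matches(multiline, RegexOptions.Singleline));         // prints: True
    }
}
```

Relying on `IgnorePatternWhitespace` to swallow that space is easy to miss when reading the code; dropping the space from the pattern would make the intent clearer.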
R.D.