I have the source code of a web page that contains several occurrences of

<div class="detName">some unpredictable text</div>

I want to get a collection of every "some unpredictable text" value.

I tried something like:

var match = Regex.Match(pageSourceCode, @"<div class='detName'>/(A-Za-z0-9\-]+)\</div>", RegexOptions.IgnoreCase);

But I had no success. What would be a good solution for this issue?

Bruno Klein

2 Answers

2

Don't use regex to parse HTML; use the HTML Agility Pack instead:

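// Requires the HtmlAgilityPack NuGet package.
string html = "<div class=\"detName\">some unpredictable text</div>";
HtmlAgilityPack.HtmlDocument doc = new HtmlAgilityPack.HtmlDocument();
doc.LoadHtml(html);
// Select every <div> whose class attribute contains "detName".
HtmlAgilityPack.HtmlNodeCollection nodes = doc.DocumentNode.SelectNodes("//div[contains(@class,'detName')]");
// SelectNodes returns null (not an empty collection) when nothing matches.
if (nodes != null)
{
    foreach (var node in nodes)
    {
        Console.WriteLine(node.InnerText);
    }
}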
Tim Schmelter
0
// Regex.Matches returns every occurrence; the lazy .*? stops at the nearest </div>.
var matches = Regex.Matches(pageSourceCode, @"(?<=<div class=""detName"">)(.*?)(?=</div>)", RegexOptions.IgnoreCase);
Robert Gannon
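For the regex route, here is a minimal sketch of gathering every occurrence into a collection, as the question asks. The class name DetNameExtractor and the method Extract are hypothetical names chosen for illustration. The pattern assumes the div carries only the class attribute and contains no nested div elements; for anything less regular, an HTML parser such as the HTML Agility Pack is the safer choice.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

public static class DetNameExtractor
{
    // Hypothetical helper: returns the inner text of every <div class="detName"> element.
    // Assumes no other attributes on the div and no nested <div> inside it.
    public static List<string> Extract(string pageSourceCode)
    {
        var results = new List<string>();

        // Accept single or double quotes around the class value.
        // (.*?) is lazy, so each match ends at the nearest closing </div>.
        const string pattern = @"<div\s+class=[""']detName[""']\s*>(.*?)</div>";

        // Singleline lets '.' match newlines, in case the text spans several lines.
        var options = RegexOptions.IgnoreCase | RegexOptions.Singleline;

        foreach (Match m in Regex.Matches(pageSourceCode, pattern, options))
        {
            results.Add(m.Groups[1].Value.Trim());
        }

        return results;
    }
}
```

Calling DetNameExtractor.Extract(pageSourceCode) would then give a List<string> with one entry per matching div, in document order.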