Find text between known pattern

Question

I have the source code of a web page, which contains several occurrences of

<div class="detName">some unpredictable text</div>

I want to be able to get a collection of every "some unpredictable text" value.

I tried something like:

var match = Regex.Match(pageSourceCode, @"<div class='detName'>/(A-Za-z0-9\-]+)\</div>", RegexOptions.IgnoreCase);

But I had no success. What would be a good solution for this issue?

score 2 · Accepted Answer · edited May 23 '17 at 11:50

Don't use regex to parse HTML. Use HTML Agility Pack instead:

string html = "<div class=\"detName\">some unpredictable text</div>";
HtmlAgilityPack.HtmlDocument doc = new HtmlAgilityPack.HtmlDocument();
doc.LoadHtml(html);
// SelectNodes returns null (not an empty collection) when nothing matches.
HtmlAgilityPack.HtmlNodeCollection nodes = doc.DocumentNode.SelectNodes("//div[contains(@class,'detName')]");
if (nodes != null)
{
    foreach (var node in nodes)
    {
        Console.WriteLine(node.InnerText);
    }
}

score 0 · Answer 2 · answered Jun 29 '13 at 23:38

var matches = Regex.Matches(pageSourceCode, @"(?<=<div class=[""']detName[""']>)(.*?)(?=</div>)", RegexOptions.IgnoreCase);
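
For a runnable end-to-end version of this lookaround approach, here is a minimal sketch. It assumes the divs are not nested and carry exactly the attribute `class="detName"` (either quote style). The class name `DetNameExtractor`, the helper `ExtractDetNames`, and the sample input are hypothetical names chosen for illustration, not part of the original answer.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

public static class DetNameExtractor
{
    // Hypothetical helper: returns the inner text of every
    // <div class="detName"> element found in the page source.
    public static List<string> ExtractDetNames(string pageSourceCode)
    {
        // [""'] accepts double or single quotes around the class name.
        // .*? is lazy, so each match stops at the nearest </div>.
        // Singleline lets '.' also match line breaks inside a div.
        MatchCollection matches = Regex.Matches(
            pageSourceCode,
            @"(?<=<div class=[""']detName[""']>).*?(?=</div>)",
            RegexOptions.IgnoreCase | RegexOptions.Singleline);

        var names = new List<string>();
        foreach (Match m in matches)
        {
            names.Add(m.Value);
        }
        return names;
    }

    public static void Main()
    {
        // Illustrative input, not taken from a real page.
        string html = "<div class=\"detName\">first</div><p>x</p><div class=\"detName\">second</div>";
        foreach (string name in ExtractDetNames(html))
        {
            Console.WriteLine(name);
        }
    }
}
```

Regex.Matches returns every occurrence, whereas Regex.Match returns only the first one. The pattern will still break on nested divs or extra attributes, which is why an HTML parser is usually the safer choice.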

Robert Gannon
