Get Div Content with Regular Expressions in C#

Question

I have this HTML code:

<div id="top" style="something i dont know">
Text
</div>

I only want to get the string "Text". My code looks like this:

Regex search_string = new Regex("<div id=\"top\".*?>([^<]+)</div>");
Match match = search_string.Match(code);
string section = match.Groups[0].Value;
MessageBox.Show(section);

Is this even possible with C#?

possible duplicate of [Extract Content from Div Tag C# RegEx](http://stackoverflow.com/questions/4775265/extract-content-from-div-tag-c-regex) — Jim Mischel, Feb 04 '11 at 16:39
Parsing HTML with regex is generally a bad idea. See http://stackoverflow.com/questions/4775265/extract-content-from-div-tag-c-regex, among many others. — Jim Mischel, Feb 04 '11 at 16:40

score 0 · Answer 1 · answered Feb 04 '11 at 16:46

Use XPath; it's much easier.

http://www.codeproject.com/KB/cpp/myXPath.aspx

Use this as the XPath selector:

//div[@id='top']

Then you can get the inner value.
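A minimal sketch of how that selector could be applied using only the built-in System.Xml classes. It assumes the markup is well-formed XML (the snippet in the question happens to be); ordinary real-world HTML often is not, and LoadXml will throw an XmlException in that case. The class and method names here are made up for illustration.

```csharp
using System.Xml;

// Hypothetical helper, not part of any library.
public static class XPathDivSketch
{
    // Returns the trimmed text of <div id="top">, or null when no such div exists.
    // Assumption: markup is well-formed XML; otherwise LoadXml throws XmlException.
    public static string GetTopDivText(string markup)
    {
        var doc = new XmlDocument();
        doc.LoadXml(markup);

        // Same selector as above: any div whose id attribute equals "top".
        XmlNode node = doc.SelectSingleNode("//div[@id='top']");

        // InnerText includes the line breaks around "Text", so trim them.
        return node?.InnerText.Trim();
    }
}
```

Calling XPathDivSketch.GetTopDivText(code) with the markup from the question should then give back just the text inside the div, without the surrounding line breaks.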

Bonshington

score 0 · Answer 2 · answered Feb 04 '11 at 16:52

You would be better off using XPath, as mentioned above. To work with HTML as if it were XML, you can use the HTML Agility Pack, which is very useful for tasks like yours.
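As a rough sketch of what that might look like, assuming the third-party HtmlAgilityPack NuGet package is installed. Unlike a strict XML parser, it is designed to tolerate malformed HTML. The class and method names are invented for this example; HtmlDocument, LoadHtml, DocumentNode, SelectSingleNode and InnerText come from the library.

```csharp
using HtmlAgilityPack; // third-party NuGet package, must be installed separately

// Hypothetical helper, not part of the library.
public static class AgilityPackDivSketch
{
    // Returns the trimmed text of <div id="top">, or null when no such div exists.
    public static string GetTopDivText(string html)
    {
        var doc = new HtmlDocument();
        doc.LoadHtml(html);

        // SelectSingleNode returns null if nothing matches the XPath expression.
        HtmlNode node = doc.DocumentNode.SelectSingleNode("//div[@id='top']");

        // InnerText keeps the whitespace around the text, so trim it.
        return node?.InnerText.Trim();
    }
}
```

With the markup from the question, AgilityPackDivSketch.GetTopDivText(code) should return the div's text, and the style attribute does not need to be known in advance.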

EvgK