
I have this HTML code:

<div id="top" style="something i dont know">
Text
</div>

I only want to get the string "Text". My script looks like this:

Regex search_string = new Regex("<div id=\"top\".*?>([^<]+)</div>");
Match match = search_string.Match(code);
string section = match.Groups[0].Value;
MessageBox.Show(section);

Is this even possible with C#?

sevi
  • possible duplicate of [Extract Content from Div Tag C# RegEx](http://stackoverflow.com/questions/4775265/extract-content-from-div-tag-c-regex) – Jim Mischel Feb 04 '11 at 16:39
  • Parsing HTML with regex is generally a bad idea. See http://stackoverflow.com/questions/4775265/extract-content-from-div-tag-c-regex, among many others. – Jim Mischel Feb 04 '11 at 16:40

2 Answers


Use XPath; it's much easier.

http://www.codeproject.com/KB/cpp/myXPath.aspx

Use this as the XPath selector:

//div[@id='top']

Then you can read the node's inner text.
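
As a rough sketch of that selector in C#, assuming the markup happens to be well-formed XML (true for the snippet in the question, often not true for real pages), the built-in System.Xml classes are enough. The class name XPathSketch and the method GetTopText are made up for illustration:

```csharp
using System;
using System.Xml;

public static class XPathSketch
{
    // Returns the trimmed inner text of <div id="top">, or null if no such div exists.
    // Assumption: the input is well-formed XML; LoadXml throws XmlException otherwise.
    public static string GetTopText(string markup)
    {
        var doc = new XmlDocument();
        doc.LoadXml(markup);

        // Same selector as above: any div whose id attribute equals "top".
        XmlNode node = doc.SelectSingleNode("//div[@id='top']");
        return node?.InnerText.Trim();
    }

    public static void Main()
    {
        string code = "<div id=\"top\" style=\"something i dont know\">\nText\n</div>";
        Console.WriteLine(GetTopText(code)); // prints "Text"
    }
}
```

The Trim() call is there because the text inside the div is surrounded by line breaks in the original markup.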

Bonshington

You are better off using XPath, as mentioned above. To work with HTML as if it were XML, you can use HTML Agility Pack, which is very useful for tasks like yours.
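
A minimal sketch of what that could look like, assuming the HtmlAgilityPack NuGet package is referenced in the project. The class name AgilityPackSketch and the method GetTopText are hypothetical; unlike an XML parser, the library tolerates markup that is not well-formed:

```csharp
using System;
using HtmlAgilityPack; // assumption: installed from the "HtmlAgilityPack" NuGet package

public static class AgilityPackSketch
{
    // Returns the trimmed inner text of <div id="top">, or null if no such div exists.
    public static string GetTopText(string html)
    {
        var doc = new HtmlDocument();
        doc.LoadHtml(html);

        // SelectSingleNode returns null when nothing matches the XPath expression.
        HtmlNode node = doc.DocumentNode.SelectSingleNode("//div[@id='top']");
        return node?.InnerText.Trim();
    }

    public static void Main()
    {
        string code = "<div id=\"top\" style=\"something i dont know\">\nText\n</div>";
        Console.WriteLine(GetTopText(code));
    }
}
```

Here `code` stands in for the page source held in the question's variable of the same name.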

EvgK