I'm using the following code to extract the contents of a div with this format: <div id="post-contents"></div>

string findtext2 = @"<div[^>]*\sid=""post-contents""[^>]*>(.*?)</div>";
string myregex2 = txt;
MatchCollection doregex2 = Regex.Matches(myregex2, findtext2);
string matches2 = "";
foreach (Match match2 in doregex2)
{
    matches2 = (matches2 + (match2.ToString()));
}
return matches2;

But I get errors when the div contains HTML tags. In practice, the div contains other HTML tags, as follows:

<div id="post-contents"><p dir="ltr">HI HI HI</p></div>

Could you please help me get just <p dir="ltr">HI HI HI</p>?

Thank you

Ali

1 Answer

Your regex works well in the described case: https://regex101.com/r/jbDN1U/1. But you can't handle cases like this with a regex:

<div id="post-contents"><div dir="ltr">HI HI HI</div></div>

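A regex can't determine which closing div to choose in this case: the lazy (.*?) stops at the first </div>, so the capture is cut short at <div dir="ltr">HI HI HI. As was mentioned in the comments, consider using an XML parser.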
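To make the difference concrete, here is a minimal sketch that compares the two approaches. The class and method names (PostContents, InnerWithRegex, InnerWithXml) are made up for illustration, and the parser version assumes the input is well-formed XML with a single root element, which real-world HTML often is not. Note that the regex version returns capture group 1 rather than the whole match, since the whole match would include the outer div.

```csharp
using System;
using System.Linq;
using System.Text.RegularExpressions;
using System.Xml.Linq;

// Hypothetical helper class, named here only for the example.
public static class PostContents
{
    // Regex approach: return capture group 1 (the text between the tags),
    // not the whole match, which would include the outer <div>.
    public static string InnerWithRegex(string html)
    {
        Match m = Regex.Match(
            html,
            @"<div[^>]*\sid=""post-contents""[^>]*>(.*?)</div>",
            RegexOptions.Singleline);
        return m.Success ? m.Groups[1].Value : "";
    }

    // Parser approach: ASSUMES the input is well-formed XML with one root element.
    public static string InnerWithXml(string xml)
    {
        XElement root = XElement.Parse(xml);
        XElement div = root.DescendantsAndSelf("div")
            .FirstOrDefault(e => (string)e.Attribute("id") == "post-contents");
        if (div == null)
        {
            return "";
        }
        // Concatenate the child nodes to get the inner markup without the outer <div>.
        return string.Concat(
            div.Nodes().Select(n => n.ToString(SaveOptions.DisableFormatting)));
    }

    public static void Main()
    {
        string simple = "<div id=\"post-contents\"><p dir=\"ltr\">HI HI HI</p></div>";
        string nested = "<div id=\"post-contents\"><div dir=\"ltr\">HI HI HI</div></div>";

        Console.WriteLine(InnerWithRegex(simple)); // <p dir="ltr">HI HI HI</p>
        Console.WriteLine(InnerWithRegex(nested)); // <div dir="ltr">HI HI HI   (cut short)
        Console.WriteLine(InnerWithXml(simple));   // <p dir="ltr">HI HI HI</p>
        Console.WriteLine(InnerWithXml(nested));   // <div dir="ltr">HI HI HI</div>
    }
}
```

If the page is not valid XML (unclosed tags, unquoted attributes and so on), XElement.Parse will throw, and a dedicated HTML parser would be the safer choice.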

Alexander