2

{Yup, the above more or less explains it} :)

Regex oRegex = new Regex("<body.*?>(.*?)</body>", RegexOptions.Multiline);

The above doesn't seem to work if the body has any attributes in it.

Stephen

3 Answers

10

With the HTML Agility Pack (assuming it is HTML, not XHTML):

HtmlDocument doc = new HtmlDocument();
doc.LoadHtml(html);
string body = doc.DocumentNode.SelectSingleNode("/html/body").InnerHtml;
Marc Gravell
4

Don't use a regular expression. Use something that's meant to parse XML/HTML:

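string body = doc.SelectSingleNode("//body").InnerXml; // doc is an XmlDocument loaded with doc.LoadXml(html)

Load your string into an XmlDocument with LoadXml, call SelectSingleNode (which takes an XPath expression as a parameter), then read what you need from the resulting XmlNode. Note that LoadXml only accepts well-formed XML, so this works for XHTML but throws an XmlException on typical tag-soup HTML.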
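
A slightly fuller sketch of the same idea, assuming the input is well-formed XHTML. The class and method names (XmlBodyReader, ReadBody) are made up for illustration, and the local-name() test is an addition so that the lookup still finds body when the document declares a default XHTML namespace:

```csharp
using System.Xml;

public static class XmlBodyReader
{
    // Returns the markup inside <body>, or an empty string if there is no body element.
    // Assumes well-formed XML: LoadXml throws XmlException otherwise.
    public static string ReadBody(string xhtml)
    {
        var doc = new XmlDocument();
        doc.LoadXml(xhtml);

        // local-name() ignores namespaces, so this matches <body> with or without xmlns.
        var body = doc.SelectSingleNode("//*[local-name()='body']");
        return body == null ? string.Empty : body.InnerXml;
    }
}
```

Attributes on the body tag are not a problem here, because the parser handles them and InnerXml returns only the child markup.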

Welbog
1

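I eventually solved it by using RegexOptions.Singleline instead of RegexOptions.Multiline. Singleline makes . match newline characters, so the pattern can span a body that covers several lines; Multiline only changes how ^ and $ match.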
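
For reference, a minimal sketch of that fix using the pattern from the question. The wrapper names (BodyExtractor, TryExtractBody) are hypothetical, and this is only one way to package it:

```csharp
using System.Text.RegularExpressions;

public static class BodyExtractor
{
    // Same pattern as in the question; only the option changed.
    // Singleline lets '.' match '\n', so the capture can cross line breaks.
    private static readonly Regex BodyRegex =
        new Regex("<body.*?>(.*?)</body>", RegexOptions.Singleline);

    // Returns true and sets body to the text between the tags when a match is found.
    public static bool TryExtractBody(string html, out string body)
    {
        Match m = BodyRegex.Match(html);
        body = m.Success ? m.Groups[1].Value : string.Empty;
        return m.Success;
    }
}
```

With this option the body tag may carry attributes and the content may contain line breaks. The match is still case-sensitive, so an upper-case BODY tag would need RegexOptions.IgnoreCase as well.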

Stephen