I want to extract the body content from a full HTML document.
I use the script below.
<?php
function getbody($filename) {
    $file = file_get_contents($filename);
    $bodystartpattern = ".*<body>";
    $bodyendpattern = "</body>.*";
    $noheader = eregi_replace($bodystartpattern, "", $file);
    $noheader = eregi_replace($bodyendpattern, "", $noheader);
    return $noheader;
}
$bodycontent = getbody($_GET['url']);
?>
But in some cases the tag is not literally <body>; it may carry attributes, for example <body style="margin:0;">. How can I write the regular expression in $bodystartpattern so that it matches everything up to and including the closing ">" of the opening body tag, whatever attributes it has?
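For illustration, here is a minimal sketch of one way this might be done. It assumes switching from eregi_replace (the ereg functions were removed in PHP 7) to preg_replace, and it assumes no attribute value contains a literal ">". The function name getBodyContent and the sample string are made up for the example; they are not part of the script above.

```php
<?php
// Hypothetical helper (name not from the original script): returns what lies
// between the opening <body ...> tag and the closing </body> tag.
function getBodyContent(string $html): string
{
    // Drop everything up to and including the opening body tag.
    // \b stops the pattern from matching longer tag names,
    // [^>]* skips any attributes up to the closing ">".
    // Modifiers: i = case-insensitive, s = "." also matches newlines.
    // preg_replace returns null on error, so fall back to the input.
    $noHeader = preg_replace('/^.*?<body\b[^>]*>/is', '', $html, 1) ?? $html;

    // Drop the closing body tag and everything after it.
    $noFooter = preg_replace('/<\/body\s*>.*$/is', '', $noHeader, 1) ?? $noHeader;

    return $noFooter;
}

// Sample input, invented for this example.
$sample = "<html><head><title>t</title></head>\n"
        . "<body style=\"margin:0;\">\n<p>Hello</p>\n</body></html>";

echo getBodyContent($sample);
```

The key part is `<body\b[^>]*>`: it accepts a bare <body> as well as a tag with any attributes. A regular expression is still a rough tool for HTML, so for arbitrary pages a real parser such as DOMDocument may be the safer choice.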