I'm trying to do something that should be very easy, but I can't get it to work, which makes me wonder whether I'm using the right workflow.
I have a simple HTML page that I load in my desktop application as a help file. This page has no menu, just the content. On my website I want a more sophisticated help system, so I want to use a PHP file that shows a menu, breadcrumbs, a header and a footer. To avoid duplicating my help content, I want to load the original HTML help file and insert its body content into my enhanced help page.
I'm using this code to extract the title:
function getURLContent($filename){
    $url = realpath(dirname(__FILE__)) . DIRECTORY_SEPARATOR . $filename;
    $doc = new DOMDocument;
    $doc->preserveWhiteSpace = FALSE;
    @$doc->loadHTMLFile($url);
    return $doc;
}
function getSingleElementValue($element){
    if (!is_null($element)) {
        $node = $element->childNodes->item(0);
        return $node->nodeValue;
    }
}
$doc = getURLContent("test.html");
$title = getSingleElementValue($doc->getElementsByTagName('title')->item(0));
echo $title;
The title is correctly extracted.
Now I try to extract the body:
function getBodyContent($element){
    $mock = new DOMDocument;
    foreach ($element->childNodes as $child){
        $mock->appendChild($mock->importNode($child, true));
    }
    return $mock->saveHTML();
}
$body = getBodyContent($doc->getElementsByTagName('body')->item(0));
echo $body;
The getBodyContent() function is one of several options I tried. All of them return the whole document, i.e. the HTML tag including the HEAD tag, instead of only the content of the body.
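For comparison, here is a minimal sketch of one more variant: serializing each child of the body separately by passing the node to DOMDocument::saveHTML(). The helper name getInnerHtml() and the inline sample markup (standing in for test.html) are assumptions made for illustration, and this is not claimed to be the definitive fix.

```php
<?php
// Hypothetical helper: returns the markup inside $element,
// without the element's own opening and closing tags.
function getInnerHtml(DOMNode $element): string
{
    $html = '';
    foreach ($element->childNodes as $child) {
        // saveHTML($node) serializes only that node and its subtree,
        // not the whole document.
        $html .= $element->ownerDocument->saveHTML($child);
    }
    return $html;
}

$doc = new DOMDocument();
// Inline sample markup standing in for test.html (an assumption for this sketch).
$doc->loadHTML('<html><head><title>Help</title></head><body><h1>Intro</h1><p>Some <b>bold</b> text.</p></body></html>');

$body = $doc->getElementsByTagName('body')->item(0);
echo getInnerHtml($body);
```

Because each child is serialized as markup rather than read through nodeValue, nested tags such as the b element should be kept in the result.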
My question is: Is this a correct workflow or should I use something else?
Thanks.
Update: My final goal is a website with multiple pages that makes the help files accessible via a menu. These pages will be generated using something like generate.php?page=test.html. I haven't reached that part yet. The goal is also to avoid duplicating the content of test.html, because this file will be used in my desktop application (using a web control), where I don't need the menu and the other extras.
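A rough sketch of how the page parameter of generate.php could be resolved, assuming a fixed whitelist of help files so that arbitrary paths cannot be requested. The names $helpPages and resolveHelpPage() and the menu label are hypothetical, and the real script would render the menu and body instead of echoing a placeholder.

```php
<?php
// Hypothetical whitelist of help pages: ?page= value => menu label.
$helpPages = [
    'test.html' => 'Getting started',
];

// Returns the absolute path of an allowed help page,
// or null if the requested name is not in the whitelist.
function resolveHelpPage(string $page, array $allowed, string $baseDir): ?string
{
    if (!array_key_exists($page, $allowed)) {
        return null;
    }
    return $baseDir . DIRECTORY_SEPARATOR . $page;
}

// Fall back to a default page when the parameter is missing or not a string.
$page = (isset($_GET['page']) && is_string($_GET['page'])) ? $_GET['page'] : 'test.html';
$path = resolveHelpPage($page, $helpPages, __DIR__);

if ($path === null) {
    http_response_code(404);
    echo 'Unknown help page';
} else {
    // Placeholder: the real page would load $path and print menu, header, body and footer.
    echo 'Would render ' . htmlspecialchars(basename($path));
}
```

Looking the name up in a whitelist, rather than building a path directly from the query string, keeps requests such as ?page=../config.php from reaching the file system.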
Update #2: I had to add <meta http-equiv="Content-Type" content="text/html; charset=utf-8"/> to the HTML file I want to read, and now I do get the body content. Unfortunately, all tags are stripped. I'll need to fix that as well.
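If editing every help file to add the meta tag is not desirable, one possible alternative is to convert non-ASCII characters to numeric entities before parsing, so the parser does not have to guess the encoding. This is only a sketch: it assumes the files are UTF-8 and that the mbstring extension is available, and loadUtf8Html() is a hypothetical helper name.

```php
<?php
// Hypothetical helper: parse UTF-8 HTML that carries no charset declaration.
// Requires the mbstring extension.
function loadUtf8Html(string $html): DOMDocument
{
    // Turn every non-ASCII character into a numeric entity (e.g. "é" becomes "&#233;"),
    // so the input handed to the parser is plain ASCII.
    $ascii = mb_encode_numericentity($html, [0x80, 0x10FFFF, 0, 0x1FFFFF], 'UTF-8');

    $doc = new DOMDocument();
    // Collect parser warnings internally instead of printing them, then restore the old setting.
    $previous = libxml_use_internal_errors(true);
    $doc->loadHTML($ascii);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    return $doc;
}

// Inline sample markup without a meta charset tag (an assumption for this sketch).
$doc = loadUtf8Html('<html><head><title>Aide</title></head><body><p>Café</p></body></html>');
echo $doc->getElementsByTagName('p')->item(0)->textContent;
```

For a file on disk, the contents could be read with file_get_contents() and passed to the same helper.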