
I'm trying to take an existing php file which I've built for a page of my site (blue.php), and grab the parts I really want with some xPath to create a different version of that page (blue-2.php).

I've been successful in pulling in my existing .php file with

$documentSource = file_get_contents("http://mysite.com/blue.php");

I can alter an attribute, and have my changes reflected correctly within blue-2.php, for example:

foreach ($xpath->query("//div[@class='walk']") as $node) {
    // Swap the class from 'walk' to 'run'
    $node->setAttribute('class', 'run');
}

With my current code, I'm limited to making changes like in the example above. What I really want to be able to do is remove/exclude certain divs and other elements from showing on my new php page (blue-2.php).

By using echo $doc->saveHTML(); at the end of my code, it appears that everything from blue.php is included in blue-2.php's output, when I only want to output certain elements, while excluding others.

So the essence of my question is:

Can I parse an entire page using $documentSource = file_get_contents("http://mysite.com/blue.php");, and pick and choose (include and exclude) which elements show on my new page, with xPath? Or am I limited to only making modifications to the existing code like in my 'div class walk/run' example above?

Thank you for any guidance.

rocky
  • Depending on exactly what transformations you want to make, you may be better off using XSL than XPath and the DOM. – Francis Avila Feb 07 '12 at 21:23
  • Hey Francis, thanks for the comment. Could you please tell me why XSL may be a better choice than xPath/DOM? Wouldn't I have to convert my existing PHP file into an XML file, and then convert it back, in order to use XSL? – rocky Feb 07 '12 at 21:45
  • Use your PHP `DOMDocument` instance with [`XSLTProcessor`](http://php.net/manual/en/class.xsltprocessor.php)--there's no conversion step. XSL can very easily express the "keep everything the same except add/modify/remove these kinds of things" pattern (see [this identity template tutorial](http://xmlplease.com/xsltidentity)). If you are doing anything more complex than what you describe here, XSL may be a more compact and natural way to do it. – Francis Avila Feb 07 '12 at 22:08
  • Although, is there some reason you can't edit or refactor `blue.php` in some way to get it to generate your `blue-2` output? This workflow seems a bit convoluted, but I'm not sure of your ultimate goal. – Francis Avila Feb 07 '12 at 22:11
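The identity-template approach mentioned in the comments above could look roughly like the following. This is only a sketch: it assumes PHP's xsl extension is installed, and the sample markup and the stylesheet are made up for illustration. The first template copies every node and attribute unchanged; the second, empty template drops the matching img element.

```php
<?php
// Sample markup standing in for the output of blue.php (an assumption).
$html = '<html><body><div class="walk">Keep me</div>'
      . '<img src="blue.png"><p>Also kept</p></body></html>';

$doc = new DOMDocument();
$doc->loadHTML($html);

// Identity template plus one empty template that removes a specific element.
$xslSource = <<<'XSL'
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="html"/>
  <xsl:template match="@*|node()">
    <xsl:copy>
      <xsl:apply-templates select="@*|node()"/>
    </xsl:copy>
  </xsl:template>
  <xsl:template match="img[@src='blue.png']"/>
</xsl:stylesheet>
XSL;

$xsl = new DOMDocument();
$xsl->loadXML($xslSource);

$processor = new XSLTProcessor();
$processor->importStylesheet($xsl);

// Apply the stylesheet directly to the DOMDocument; no conversion step needed.
$result = $processor->transformToXml($doc);
echo $result;
```

Further empty templates can be added for each kind of element that should be dropped, which is where this pattern tends to pay off over hand-written DOM loops.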

1 Answer


I've tried this, and it just throws errors:

$xpath->query("//img[@src='blue.png']")->remove();

What part of the documentation made you think remove is a method of DOMNodeList? Use DOMNode::removeChild instead:

foreach($xpath->query("//img[@src='blue.png']") as $node){
    $node->parentNode->removeChild($node);
}
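For context, here is a self-contained sketch of the same loop, from loading the markup to printing the result. The sample HTML string is an assumption standing in for the output of blue.php. As far as I know, the list returned by DOMXPath::query is a static snapshot, so removing nodes while iterating over it is safe (the live list from getElementsByTagName behaves differently).

```php
<?php
// Sample markup standing in for the output of blue.php (an assumption).
$html = '<html><body><div class="walk">'
      . '<img src="blue.png"><img src="red.png">'
      . '</div></body></html>';

$doc = new DOMDocument();
// Collect parser warnings internally; real-world HTML often triggers them.
libxml_use_internal_errors(true);
$doc->loadHTML($html);
libxml_clear_errors();

$xpath = new DOMXPath($doc);

// Detach every matching image from its parent element.
foreach ($xpath->query("//img[@src='blue.png']") as $node) {
    $node->parentNode->removeChild($node);
}

$output = $doc->saveHTML();
echo $output;
```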

I would suggest browsing through the classes and functions of the DOM extension (which is not PHP-only, BTW) to get a feel for what can be found where.

On a side note: it would probably be far more resource-efficient to add a switch to your original blue.php that produces the different output, because this solution (an extra HTTP request plus a full DOM load and manipulation) has a LOT of unneeded overhead by comparison.
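A minimal sketch of what such a switch might look like, without knowing what blue.php actually contains. The function name renderBlue, the `variant` query parameter, and the markup are all hypothetical.

```php
<?php
// Hypothetical renderer shared by both versions of the page.
function renderBlue(bool $isVariant): string
{
    $html = '<h1>Blue</h1>';

    // Only the original page shows the image.
    if (!$isVariant) {
        $html .= '<img src="blue.png" alt="">';
    }

    // The variant uses a different class on the same div.
    $class = $isVariant ? 'run' : 'walk';
    $html .= '<div class="' . $class . '">Content</div>';

    return $html;
}

// Hypothetical switch: blue.php?variant=2 selects the second version.
$isVariant = isset($_GET['variant']) && $_GET['variant'] === '2';
echo renderBlue($isVariant);
```

This way the second version is produced directly, with no extra HTTP request and no DOM parsing.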

Wrikken
  • Thank you for the answer Wrikken. I agree that my method seems to be a little too heavy on the server. Can you please give me an example of a switch for my original PHP file? I'm not sure exactly what you mean. – rocky Feb 07 '12 at 21:48
  • Also, you've shown me how to remove/exclude certain elements with xPath. Is it possible to instead, include only certain elements from my original PHP file? To me that seems like it would be much less resource-intensive. – rocky Feb 07 '12 at 21:52
  • Without seeing it? Unlikely. But possibly blue-2.php=``, and just have some if-statements in `blue.php`: `if(!$is2){ /*echo something*/ } else {/*echo something else, or nothing if you want */}` – Wrikken Feb 07 '12 at 21:53
  • OK gotcha. I've been playing around with SimpleDOM http://simplehtmldom.sourceforge.net/, and this looks like it may be what I need. Thanks for the help! – rocky Feb 07 '12 at 23:11
  • `DOM` is [INFINITELY FASTER than simplehtmldom](http://stackoverflow.com/questions/3419138/how-to-use-preg-in-php-to-add-html-properties/3419149#3419149). If you find DOM too hard, consider [one of the alternatives](http://stackoverflow.com/questions/3577641/how-to-parse-and-process-html-with-php/3577662#3577662) – Wrikken Feb 07 '12 at 23:17
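On the follow-up question in the comments about including only certain elements rather than removing the unwanted ones: one possible approach with the plain DOM extension is to serialize just the matched nodes by passing each node to DOMDocument::saveHTML. The sketch below assumes PHP 5.3.6 or later (when that node argument was added), and the sample markup is made up for illustration.

```php
<?php
// Sample markup standing in for the output of blue.php (an assumption).
$html = '<html><body>'
      . '<div id="header">Header</div>'
      . '<div class="walk">Walk content</div>'
      . '<div class="ads">Ads</div>'
      . '<div class="walk">More walk</div>'
      . '</body></html>';

$doc = new DOMDocument();
$doc->loadHTML($html);
$xpath = new DOMXPath($doc);

// Serialize only the nodes the query selects; everything else is left out.
$output = '';
foreach ($xpath->query("//div[@class='walk']") as $node) {
    $output .= $doc->saveHTML($node) . "\n";
}

echo $output;
```

The whole page still has to be fetched and parsed first, so this changes what is printed, not how much work the server does.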