Alternatives to PHP getElementById() of DOMDocument?

Question

EDIT: related to PHP HTML DomDocument getElementById problems

I can't get getElementById() to work. The document is valid according to the W3C validator, and my code should work even with slightly invalid HTML.

This simple code looks for an image with id="banner" and replaces its src attribute with another one. It works on my development machine (Windows) but not on the server (Ubuntu).

Any idea how to do this without getElementById()?

    libxml_use_internal_errors(true);

    // Create the DOMDocument and get the HTML content
    $document = new \DOMDocument();

    // Load HTML string and if it fails just return the content itself
    if(false === $document->loadHTML($content)) return $content;

    // Get DOMElement of the image with id="banner"
    $img = $document->getElementById('banner');

    // Return the content if it can't find the image
    if(null === $img) return $content;

    // Get image parent and remove the banner from DOM
    $parent = $img->parentNode;
    $parent->removeChild($img);

    // Set the new src attribute
    $img->setAttribute('src', 'http://mysite.come/img/myimage.png');

    // Append the modified node to banner parent
    $parent->appendChild($img);

    return $document->saveHTML();

You're returning the same thing any time you hit an error condition. That really makes it hard to pinpoint what went wrong when/if it does. — FtDRbwLXw6, Aug 18 '12 at 02:06
@drrcknlsn yes, it's the default behaviour: return $content if anything goes wrong. But the point is that getElementById returns null and the document doesn't validate. — gremo, Aug 18 '12 at 02:09
Just to be clear again, I don't care if the document is invalid. Just replace the src if there is one, or return the content. So simple... — gremo, Aug 18 '12 at 02:10
I guess my point was - how do you know `getElementById()` is returning `NULL`? What kind of debugging did you try? If you're just making that assumption based on the fact that `$content` was returned, then maybe `loadHTML()` returned `FALSE`. That would also cause your script to return `$content`. — FtDRbwLXw6, Aug 18 '12 at 03:05
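
One way to act on that comment is to record which step fails instead of returning $content in every case. The following is only a sketch: the function name diagnoseBannerLookup and the report keys are hypothetical and not part of the question's code, and the XPath query is added purely as a cross-check.

```php
<?php
// Hypothetical diagnostic helper (not from the question): reports separately
// whether loading succeeded and whether each lookup strategy finds the banner.
function diagnoseBannerLookup(string $content): array
{
    $previous = libxml_use_internal_errors(true);
    libxml_clear_errors();

    $document = new DOMDocument();
    // Guard against an empty string, which loadHTML() rejects.
    $loaded = $content !== '' && $document->loadHTML($content);

    $report = [
        'loaded'         => $loaded,
        'parse_errors'   => count(libxml_get_errors()),
        'found_by_id'    => false,
        'found_by_xpath' => 0,
    ];

    if ($loaded) {
        $report['found_by_id'] = null !== $document->getElementById('banner');

        $xpath = new DOMXPath($document);
        $report['found_by_xpath'] = $xpath->query('//img[@id="banner"]')->length;
    }

    // Restore the previous libxml error handling state.
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    return $report;
}
```

If 'loaded' is true and 'found_by_xpath' is 1 while 'found_by_id' is false, the problem lies in the ID lookup rather than in loadHTML().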

score 2 · Accepted Answer · answered Aug 18 '12 at 02:32

Got it, though I don't know how invalid-document-proof it is:

    // Query the already loaded document for <img> elements with id="banner"
    $xpath = new \DOMXPath($document);
    $nodes = $xpath->query('//img[@id="banner"]');

    // Return content if we don't have exactly one image with id="banner"
    if(1 !== $nodes->length) return $content;

    // DOMNode of the banner
    $banner = $nodes->item(0);

    // Set the new src attribute (the node is modified in place,
    // so no separate per-node save call is needed)
    $banner->setAttribute('src', 'http://mysite.come/img/myimage.png');

    return $document->saveXML();
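
For reference, the XPath approach can be wrapped into a self-contained function. This is a minimal sketch under a few assumptions: the name replaceBannerSrc and its parameters are hypothetical, and it serializes with saveHTML() as in the question rather than saveXML() as in the snippet above.

```php
<?php
// Hypothetical wrapper around the XPath lookup; the function name and
// parameters are illustrative and do not come from the original post.
function replaceBannerSrc(string $content, string $newSrc): string
{
    // Nothing to parse: return the input unchanged.
    if ($content === '') {
        return $content;
    }

    $previous = libxml_use_internal_errors(true);
    $document = new DOMDocument();
    $loaded = $document->loadHTML($content);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    if (false === $loaded) {
        return $content;
    }

    $xpath = new DOMXPath($document);
    $nodes = $xpath->query('//img[@id="banner"]');

    // Require exactly one <img id="banner">, as in the answer.
    if (false === $nodes || 1 !== $nodes->length) {
        return $content;
    }

    // The node is modified in place inside $document.
    $nodes->item(0)->setAttribute('src', $newSrc);

    return $document->saveHTML();
}
```

A call such as replaceBannerSrc($content, 'http://mysite.come/img/myimage.png') returns the input unchanged whenever the banner cannot be identified unambiguously.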

score -2 · Answer 2 · answered Aug 18 '12 at 03:08

From the DOMDocument::getElementById() documentation:

For this function to work, you will need either to set some ID attributes with DOMElement::setIdAttribute or a DTD which defines an attribute to be of type ID. In the later case, you will need to validate your document with DOMDocument::validate or DOMDocument::$validateOnParse before using this function.

FtDRbwLXw6


Document has DTD and validates against W3C validator, but not against DOMDocument. – gremo Aug 18 '12 at 03:47
You didn't call `DOMDocument::validate()` or set `DOMDocument::$validateOnParse` to `TRUE` in your code above like it says you have to. Whether or not it passes W3C validation is irrelevant to how `DOMDocument` functions in PHP... – FtDRbwLXw6 Aug 18 '12 at 03:56
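
The first option in the quoted documentation, DOMElement::setIdAttribute, can be sketched as follows. The helper name registerIdAttributes is hypothetical, and the example deliberately uses a small XML string loaded with loadXML() and no DTD, so it does not depend on how a given libxml build treats id attributes in HTML.

```php
<?php
// Hypothetical helper: marks every "id" attribute in the document as an ID
// so that getElementById() can resolve it. Returns the number of elements touched.
function registerIdAttributes(DOMDocument $document): int
{
    $xpath = new DOMXPath($document);
    $count = 0;

    foreach ($xpath->query('//*[@id]') as $element) {
        $element->setIdAttribute('id', true);
        $count++;
    }

    return $count;
}

$document = new DOMDocument();
$document->loadXML('<page><img id="banner" src="old.png"/></page>');

// Lookup before registration: no DTD and no ID attributes declared yet.
$before = $document->getElementById('banner');

$registered = registerIdAttributes($document);

// Lookup after registration.
$after = $document->getElementById('banner');
```

The trade-off is that this still needs an XPath pass over the document, so for a single lookup the XPath query from the accepted answer is simpler.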
