
I'm exploring XML and PHP, mostly XPath and other parsers.

Here be the xml:

<?xml version="1.0" encoding="UTF-8"?>

<root xmlns:foo="http://www.foo.org/" xmlns:bar="http://www.bar.org">
    <actors>
        <actor id="1">Christian Bale</actor>
        <actor id="2">Liam Neeson</actor>
        <actor id="3">Michael Caine</actor>
    </actors>
    <foo:singers>
        <foo:singer id="4">Tom Waits</foo:singer>
        <foo:singer id="5">B.B. King</foo:singer>
        <foo:singer id="6">Ray Charles</foo:singer>
    </foo:singers>
    <items>
        <item id="7">Pizza</item>
        <item id="8">Cheese</item>
        <item id="9">Cane</item>
    </items>
</root>

Here be my path & code:

$xml = simplexml_load_file('xpath.xml');

$result = $xml -> xpath('/root/actors');

echo '<pre>'.print_r($result,1).'</pre>';

Now, said path returns:

Array
(
    [0] => SimpleXMLElement Object
        (
            [actor] => Array
                (
                    [0] => Christian Bale
                    [1] => Liam Neeson
                    [2] => Michael Caine
                )
        )
)

Whereas a seemingly similar line of code, which I would have thought would return the singers, doesn't. Meaning:

$result = $xml -> xpath('/root/foo:singers');

Results in:

Array
    (
        [0] => SimpleXMLElement Object
            (
            )

    )

Now I would've thought the foo: namespace in this case is a non-issue and both paths should result in the same sort of array of singers/actors respectively? How come that is not the case?

Thank-you!

Note: As you can probably gather I'm quite new to xml so please be gentle.

Edit: When I go /root/foo:singers/foo:singer I get results, but not before. Also with just /root I only get actors and items as results, foo:singers are completely omitted.

Russ
  • Have you seen a few posts [like this one](http://stackoverflow.com/questions/2098170/php-namespace-simplexml-problems)? The solution with SimpleXML to get elements outside the default namespace is `children()`, like `var_dump($result[0]->children('foo', true));` passing the namespace prefix as the first arg and `true` as the second to indicate it is a prefix and not a full NS. – Michael Berkowski Mar 23 '15 at 20:14
  • Like @MichaelBerkowski already commented: what you are seeing is that the element is not within the document's *default namespace*. And `print_r` is not particularly useful with **SimpleXMLElement** in any case; at best it shows you what's in the default namespace. – hakre Mar 23 '15 at 23:54

2 Answers

SimpleXML is, for a number of reasons, simply a bad API.

For most purposes I suggest PHP's DOM extension. (Or for very large documents a combination of it along with XMLReader.)

For using namespaces in xpath you'll want to register those you'd like to use, and the prefix you want to use them with, with your xpath processor.


Example:

$dom = new DOMDocument();
$dom->load('xpath.xml');
$xpath = new DOMXPath($dom);

// The prefix *can* match that used in the document, but it's not necessary.
$xpath->registerNamespace("ns", "http://www.foo.org/");

foreach ($xpath->query("/root/ns:singers") as $node) {
    echo $dom->saveXML($node);
}

Output:

<foo:singers>
    <foo:singer id="4">Tom Waits</foo:singer>
    <foo:singer id="5">B.B. King</foo:singer>
    <foo:singer id="6">Ray Charles</foo:singer>
</foo:singers>

DOMXPath::query returns a DOMNodeList containing matched nodes. You can work with it essentially the same way you would in any other language with a DOM implementation.
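As a rough sketch of working with that node list, the snippet below loops over the matched singer elements and reads each one's id attribute and text. It also uses DOMXPath::evaluate() for an expression that yields a string rather than nodes. The XML is inlined (and shortened) so the snippet runs on its own; the prefix "ns" and the variable names are just illustrative choices.

```php
<?php
// Shortened copy of the question's document, inlined so the sketch is self-contained.
$source = <<<'XML'
<?xml version="1.0" encoding="UTF-8"?>
<root xmlns:foo="http://www.foo.org/">
    <foo:singers>
        <foo:singer id="4">Tom Waits</foo:singer>
        <foo:singer id="5">B.B. King</foo:singer>
    </foo:singers>
</root>
XML;

$dom = new DOMDocument();
$dom->loadXML($source);

$xpath = new DOMXPath($dom);
// "ns" is an arbitrary prefix chosen here; only the namespace URI has to match.
$xpath->registerNamespace('ns', 'http://www.foo.org/');

// query() returns a DOMNodeList of DOMElement nodes.
$singers = [];
foreach ($xpath->query('/root/ns:singers/ns:singer') as $node) {
    $singers[] = $node->getAttribute('id') . ': ' . $node->textContent;
}
print_r($singers);

// evaluate() can also return scalar results, such as the string value of a node.
$name = $xpath->evaluate('string(/root/ns:singers/ns:singer[@id=5])');
echo $name, PHP_EOL;
```

The same XPath processor handles both calls, so the namespace only has to be registered once.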

user3942918
  • I suggest using `DOMXPath::evaluate()`, not `DOMXPath::query()`. `DOMXPath::query()` doesn't support expressions that return scalar values like `string(/root/ns:singers/ns:singer[@id=5])`. – ThW Mar 24 '15 at 11:22

You can use a // expression such as:

$xml -> xpath( '//foo:singer' );

to select all foo:singer elements no matter where they are.
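A minimal sketch of that, assuming a shortened inline copy of the question's XML: the matched elements do carry their content, which can be read by casting each one to a string, even though print_r() would show them as empty objects.

```php
<?php
// Shortened copy of the question's document, inlined so the sketch is self-contained.
$source = <<<'XML'
<?xml version="1.0" encoding="UTF-8"?>
<root xmlns:foo="http://www.foo.org/">
    <foo:singers>
        <foo:singer id="4">Tom Waits</foo:singer>
        <foo:singer id="5">B.B. King</foo:singer>
    </foo:singers>
</root>
XML;

$xml = simplexml_load_string($source);

// The foo prefix is declared on the root element, so it can be used in the path.
$names = [];
foreach ($xml->xpath('//foo:singer') as $singer) {
    // Casting a SimpleXMLElement to string yields its text content.
    $names[] = (string) $singer;
}
print_r($names);
```

If the path should use a prefix other than the one in the document, SimpleXMLElement::registerXPathNamespace() can bind one to the same URI before calling xpath().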

EDIT:

The SimpleXMLElement is selected; you just can't see its child nodes with print_r(). Use SimpleXMLElement methods such as SimpleXMLElement::children() to access them.

// example 1
$result = $xml->xpath( '/root/foo:singers' );

foreach( $result as $value ) {
    print_r( $value->children( 'foo', TRUE ) );
}

// example 2
print_r( $result[0]->children( 'foo', TRUE )->singer );
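The following sketch illustrates why the element looked empty, again assuming a shortened inline copy of the question's XML. Called without arguments, children() only sees child elements that have no namespace prefix, which is roughly the view print_r() gives. Passing the namespace (here by URI rather than by prefix) exposes the singers.

```php
<?php
// Shortened copy of the question's document, inlined so the sketch is self-contained.
$source = <<<'XML'
<?xml version="1.0" encoding="UTF-8"?>
<root xmlns:foo="http://www.foo.org/">
    <foo:singers>
        <foo:singer id="4">Tom Waits</foo:singer>
        <foo:singer id="5">B.B. King</foo:singer>
    </foo:singers>
</root>
XML;

$xml    = simplexml_load_string($source);
$result = $xml->xpath('/root/foo:singers');

// Without a namespace argument, the prefixed children are not visible.
$plainCount = count($result[0]->children());

// With the namespace URI, the same element yields its foo:singer children.
$names = [];
foreach ($result[0]->children('http://www.foo.org/') as $singer) {
    $names[] = (string) $singer;
}

echo $plainCount, PHP_EOL;
print_r($names);
```

Using the URI instead of the prefix keeps the code working even if the document later declares the same namespace under a different prefix.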
Danijel
  • Oh no, I understand that, I know how to work around it. I'm just wondering why it's happening. Thanks :) – Russ Mar 23 '15 at 20:24