
I have this XML (from a pptx file):

<Relationships>
    <Relationship Id="rId3" Type="http://schemas.openxmlformats.org/officeDocument/2006/relationships/image" Target="../media/image2.png"/>
    <Relationship Id="rId2" Type="http://schemas.openxmlformats.org/officeDocument/2006/relationships/image" Target="../media/image1.wmf"/>
    <Relationship Id="rId1" Type="http://schemas.openxmlformats.org/officeDocument/2006/relationships/slideLayout" Target="../slideLayouts/slideLayout1.xml"/>
</Relationships>

I want to pull the Target attribute from a Relationship element, and I know the Id value.

I could do it with SimpleXML if I iterate through the nodes (like this question)

$resxml = simplexml_load_file('zip://my.pptx#ppt/slides/_rels/slide1.xml.rels');
echo $resxml->Relationship[0]->attributes()->Target;

But I would like to get it with XPath, using this sort of idea. Whatever I try in XPath returns an empty object when I search for something like 'rId3'. I thought the statement below would work, but it also returns an empty object. I have tried about 50 combinations and found a lot of similar but not identical issues when searching:

$image = $resxml->xpath("/Relationships/Relationship[@Id='rId3']/@Target"); 
print_r($image);

I guess I will just end up iterating through all the nodes, but that seems inefficient. My server appears to have DOM XPath available and SimpleXML enabled.

Thomas
  • Thanks Thomas and ThW! The answers were very useful. I was able to grab the target attribute using the r:id that I found from my xml file. I got it to work both with the DOM and the simplexml. I would suggest to anyone attempting to get the string value from a tag and the 'Target' attribute to do so with the DOM Object. Less lines of code. – Travis Smith Jun 07 '17 at 16:41

2 Answers


Thank you. Your excellent answer was the key to me finding the solution. After reading your post, I found elsewhere on Stack Exchange that SimpleXML drops the namespace attributes on the first node from its output. I had considered the namespace as the issue, but I had only looked at the SimpleXML output when inspecting the tree. You put me right by pointing me to the real source.

My solution using only SimpleXML looks like this:

$resxml->registerXPathNamespace('r', 'http://schemas.openxmlformats.org/package/2006/relationships');
$image = $resxml->xpath("/r:Relationships/r:Relationship[@Id='rId3']/@Target"); 
print_r($image);
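
Note that xpath() returns an array of SimpleXMLElement objects, so the value still has to be cast to get a plain string. A minimal sketch of that step, assuming an inline XML string stands in for the .rels file; the helper name targetForId is made up for illustration:

```php
<?php
// Inline stand-in for ppt/slides/_rels/slide1.xml.rels (assumed content).
$rels = <<<'XML'
<?xml version="1.0" encoding="UTF-8" standalone="yes"?>
<Relationships xmlns="http://schemas.openxmlformats.org/package/2006/relationships">
  <Relationship Id="rId3" Type="http://schemas.openxmlformats.org/officeDocument/2006/relationships/image" Target="../media/image2.png"/>
  <Relationship Id="rId2" Type="http://schemas.openxmlformats.org/officeDocument/2006/relationships/image" Target="../media/image1.wmf"/>
</Relationships>
XML;

// Hypothetical helper: returns the Target for a relationship Id, or null.
function targetForId(SimpleXMLElement $resxml, string $id): ?string
{
    $resxml->registerXPathNamespace('r', 'http://schemas.openxmlformats.org/package/2006/relationships');
    // The Id is interpolated into the expression, so it must not contain quotes.
    $matches = $resxml->xpath("/r:Relationships/r:Relationship[@Id='$id']/@Target");
    // An empty array (no match) or false (error) both fall through to null.
    return $matches ? (string) $matches[0] : null;
}

$resxml = simplexml_load_string($rels);
echo targetForId($resxml, 'rId3'), "\n"; // prints ../media/image2.png
```

Returning null for a missing Id keeps the caller from having to inspect an empty array.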
Thomas

I think your problem might be the namespace. PPTX relationship files declare a default namespace on the root element; the example below uses "http://schemas.microsoft.com/package/2005/06/relationships", but check the source of your own file for the exact URI. SimpleXML's xpath() does its own magic, too. If the file contains the namespace, you have to register your own prefix for it.

$xml = <<<'XML'
<?xml version="1.0" encoding="UTF-8" standalone="yes" ?>
<Relationships
 xmlns="http://schemas.microsoft.com/package/2005/06/relationships">
 <Relationship Id="rId1"
 Type="http://schemas.microsoft.com/office/2006/relationships/image"
 Target="http://en.wikipedia.org/images/wiki-en.png"
 TargetMode="External" />
 <Relationship Id="rId2"
 Type="http://schemas.microsoft.com/office/2006/relationships/hyperlink"
 Target="http://www.wikipedia.org"
 TargetMode="External" />
</Relationships> 
XML;

$dom = new DOMDocument();
$dom->loadXml($xml);
$xpath = new DOMXpath($dom);
$xpath->registerNamespace('r', 'http://schemas.microsoft.com/package/2005/06/relationships');

var_dump(
  $xpath->evaluate("string(/r:Relationships/r:Relationship[@Id='rId2']/@Target)", NULL, FALSE)
);

Output:

string(24) "http://www.wikipedia.org"

XPath has no concept of a default namespace. A name without a prefix matches only elements that are in no namespace. Attributes are in no namespace unless they are explicitly prefixed.

To make the confusion complete, the PHP functions (SimpleXMLElement::xpath(), DOMXpath::query() and DOMXpath::evaluate()) automatically register the namespace definitions of the context node. For the two DOMXpath methods, the third argument lets you disable that behaviour.
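
The difference is easy to see by running the same lookup with and without a prefix. A small sketch, assuming a trimmed-down copy of the example document; the variable names are illustrative only:

```php
<?php
$xml = <<<'XML'
<Relationships xmlns="http://schemas.microsoft.com/package/2005/06/relationships">
  <Relationship Id="rId2" Target="http://www.wikipedia.org" TargetMode="External"/>
</Relationships>
XML;

$dom = new DOMDocument();
$dom->loadXML($xml);
$xpath = new DOMXPath($dom);
$xpath->registerNamespace('r', 'http://schemas.microsoft.com/package/2005/06/relationships');

// Unprefixed names only match elements in no namespace, so nothing is found.
$unprefixed = $xpath->query("/Relationships/Relationship[@Id='rId2']");

// Prefixed names match the namespaced elements. @Id stays unprefixed
// because the attribute itself is in no namespace.
$prefixed = $xpath->query("/r:Relationships/r:Relationship[@Id='rId2']");

var_dump($unprefixed->length); // int(0)
var_dump($prefixed->length);   // int(1)
```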

Unlike the other two functions, DOMXpath::evaluate() can return scalars directly.
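
As a rough illustration of that difference, the sketch below reads the same attribute once through query() and once through evaluate(). It assumes a trimmed-down copy of the example document, and the variable names are made up for the comparison:

```php
<?php
$xml = <<<'XML'
<Relationships xmlns="http://schemas.microsoft.com/package/2005/06/relationships">
  <Relationship Id="rId1" Target="http://en.wikipedia.org/images/wiki-en.png"/>
  <Relationship Id="rId2" Target="http://www.wikipedia.org"/>
</Relationships>
XML;

$dom = new DOMDocument();
$dom->loadXML($xml);
$xpath = new DOMXPath($dom);
$xpath->registerNamespace('r', 'http://schemas.microsoft.com/package/2005/06/relationships');

$expr = "/r:Relationships/r:Relationship[@Id='rId2']/@Target";

// query() always yields a DOMNodeList, so the value is read from the node.
$nodes = $xpath->query($expr);
$fromQuery = $nodes->length > 0 ? $nodes->item(0)->nodeValue : null;

// evaluate() can apply an XPath function and hand back the scalar directly.
$fromEvaluate = $xpath->evaluate("string($expr)");

// With string(), a missing node gives an empty string rather than an error.
$missing = $xpath->evaluate("string(/r:Relationships/r:Relationship[@Id='nope']/@Target)");

// count() comes back as a number instead of a node list.
$count = $xpath->evaluate('count(/r:Relationships/r:Relationship)');

var_dump($fromQuery, $fromEvaluate, $missing, $count);
```

The empty-string result means a caller using string() has to test for '' to detect a missing relationship.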

ThW
  • Despite my answer below, I ended up using the DOM because I had not realised that you can't delete nodes in SimpleXML. – Thomas Dec 03 '13 at 08:49
  • Also worth noting that if you drag the Relationship XML into Firefox, the attribute of the first node is stripped out. – Thomas Dec 03 '13 at 08:50
  • It is not shown in the interactive display, but it is still there in the source. Maybe that is because namespace nodes are kept separate in the DOM. If you work with an XML DOM in JavaScript, the namespace is there, too. – ThW Dec 03 '13 at 09:00