I have the following XML:
<?xml version="1.0"?>
<?mso-application progid="Excel.Sheet"?>
<Workbook xmlns="urn:schemas-microsoft-com:office:spreadsheet"
xmlns:o="urn:schemas-microsoft-com:office:office"
xmlns:x="urn:schemas-microsoft-com:office:excel"
xmlns:ss="urn:schemas-microsoft-com:office:spreadsheet"
xmlns:html="http://www.w3.org/TR/REC-html40">
<Names>
<NamedRange ss:Name="SomeNamedRange" ss:RefersTo="=Control!R1C1:R51C4"/>
</Names>
<Worksheet ss:Name="Control" ss:Protected="1">
<Table ss:ExpandedColumnCount="4" ss:ExpandedRowCount="51">
<Row>
<Cell ss:StyleID="s145">
<Comment ss:Author="Some comment here">
<ss:Data xmlns="http://www.w3.org/TR/REC-html40"></ss:Data>
</Comment>
</Cell>
</Row>
</Table>
</Worksheet>
</Workbook>
I would like to get the Names element with XPath, so I try:
//Names
but this doesn't work. So far, I have found a number of expressions that do work:
//ss:Names
//*:Names
//*[local-name()='Names']
Alternatively, I can keep //Names and delete the following element:
<ss:Data xmlns="http://www.w3.org/TR/REC-html40"></ss:Data>
So clearly, this is something to do with namespaces, but I still don't really understand what's going on. So I have three questions:
- Why does deleting the ss:Data element affect being able to read the Names element?
- Given that there are 5 namespaces declared at the top, why is the Names element considered to be in the ss namespace (when the ss:Data element exists)?
- What is the correct general approach here? I feel like there is some general piece of information I'm missing about either XML or XPath.
EDIT:
This issue is not limited to http://xpather.com/. I have had various results with different XPath websites, and have summarised the results here.
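For comparison, here is a minimal sketch for reproducing this outside of any XPath website. It assumes Python's standard xml.etree.ElementTree module, which only supports a subset of XPath (so //*:Names and local-name() are not covered here). The XML is a trimmed copy of the workbook above, and the NS mapping and variable names are my own choices for illustration.

```python
import xml.etree.ElementTree as ET

# Trimmed copy of the workbook from the question (namespace-relevant parts only).
DATA_ELEMENT = '<ss:Data xmlns="http://www.w3.org/TR/REC-html40"></ss:Data>'
XML = """<Workbook xmlns="urn:schemas-microsoft-com:office:spreadsheet"
 xmlns:ss="urn:schemas-microsoft-com:office:spreadsheet"
 xmlns:html="http://www.w3.org/TR/REC-html40">
 <Names>
  <NamedRange ss:Name="SomeNamedRange" ss:RefersTo="=Control!R1C1:R51C4"/>
 </Names>
 <Worksheet ss:Name="Control" ss:Protected="1">
  <Table ss:ExpandedColumnCount="4" ss:ExpandedRowCount="51">
   <Row>
    <Cell ss:StyleID="s145">
     <Comment ss:Author="Some comment here">
      """ + DATA_ELEMENT + """
     </Comment>
    </Cell>
   </Row>
  </Table>
 </Worksheet>
</Workbook>"""

# The prefix in the query is arbitrary; only the namespace URI has to match.
NS = {"ss": "urn:schemas-microsoft-com:office:spreadsheet"}

root = ET.fromstring(XML)

# The parser reports Names with its namespace URI attached, inherited from
# the default xmlns declaration on Workbook.
names_tag = root[0].tag

unprefixed = root.findall(".//Names")       # no namespace in the query
prefixed = root.findall(".//ss:Names", NS)  # prefix mapped to the URI
wildcard = root.findall(".//{*}Names")      # any namespace (Python 3.8+)

# Same unprefixed query after deleting the ss:Data element.
root_without_data = ET.fromstring(XML.replace(DATA_ELEMENT, ""))
unprefixed_without_data = root_without_data.findall(".//Names")

print(names_tag)
print(len(unprefixed), len(prefixed), len(wildcard), len(unprefixed_without_data))
```

With this parser, the unprefixed query finds nothing whether or not ss:Data is present, while the prefixed and wildcard queries each find the one Names element. That suggests the effect of deleting ss:Data is specific to the tools I tried rather than to the XML itself, though I have only checked this one library.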