
How do I read/parse an XML document where the XML namespace alias is unknown? The structure and namespaces of the XML document are known, but the alias is not. E.g.

<?xml version="1.0" encoding="utf-8"?>
<Order xmlns:aa="urn:namespace1"
       xmlns:bb="urn:namespace2">
  <aa:Quantity>1</aa:Quantity>
  <bb:Price>9.98</bb:Price>
</Order>

Or

<?xml version="1.0" encoding="utf-8"?>
<Order xmlns:cc="urn:namespace1"
       xmlns:dd="urn:namespace2">
  <cc:Quantity>1</cc:Quantity>
  <dd:Price>9.98</dd:Price>
</Order>

Update: I don't know the XML namespace aliases up front. They can be whatever.

I need to supply the XmlNamespaceManager with a list of namespaces and aliases using the AddNamespace method, like so:

XPathDocument xDoc = new XPathDocument("Path to my file");
XPathNavigator xNav = xDoc.CreateNavigator();
XmlNamespaceManager xmlns = new XmlNamespaceManager(xNav.NameTable);
xmlns.AddNamespace("aa", "urn:namespace1");
xmlns.AddNamespace("bb", "urn:namespace2");

But this is not XML namespace agnostic. My second document uses cc and dd as aliases for the same namespaces.

Lybecker
  • The problem is, of course, that namespaces are there for a good reason, so you usually have to be aware of them when parsing XML. However, some people just opt to strip the namespaces out - see http://stackoverflow.com/questions/987135/how-to-remove-all-namespaces-from-xml-with-c for example. – dash Oct 15 '12 at 09:40
  • The problem seems to be that AddNamespace() requires the prefix. I assume `"urn:namespace1"` is literally the same in both cases. With XNamespace this is trivially solved. Can you use XDocument? – H H Oct 15 '12 at 09:53
  • "urn:namespace1" is the same instance in both cases. Yes - I can use XDocument, but my documents can get very large – Lybecker Oct 15 '12 at 10:06
  • A misconception here seems to be that the namespace prefix used in the C# code should match the namespace prefix used in the XML. That is not necessary. – Martin Liversage Oct 15 '12 at 10:29

2 Answers


The code you have provided is namespace agnostic in the sense that the namespace prefixes used in the source XML do not matter. Given the namespace definitions in your question, the XPath has to use the prefixes you defined in your code, e.g. aa and bb.

var quantity = xNav.SelectSingleNode("/Order/aa:Quantity", xmlns);

However, this code will still successfully select from the XML where prefixes cc and dd are used as long as the namespaces urn:namespace1 and urn:namespace2 are correctly used.
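As a minimal sketch of this point, the following runs the same XPath, with the code-side prefix aa, against both documents from the question. The class and member names (PrefixDemo, ReadQuantity, DocWithAaBb, DocWithCcDd) are made up for illustration, and the XML is loaded from strings rather than a file to keep the example self-contained.

```csharp
using System;
using System.IO;
using System.Xml;
using System.Xml.XPath;

public static class PrefixDemo
{
    // First document from the question: prefixes aa and bb.
    public const string DocWithAaBb = @"<Order xmlns:aa=""urn:namespace1""
       xmlns:bb=""urn:namespace2"">
  <aa:Quantity>1</aa:Quantity>
  <bb:Price>9.98</bb:Price>
</Order>";

    // Second document from the question: same namespaces, prefixes cc and dd.
    public const string DocWithCcDd = @"<Order xmlns:cc=""urn:namespace1""
       xmlns:dd=""urn:namespace2"">
  <cc:Quantity>1</cc:Quantity>
  <dd:Price>9.98</dd:Price>
</Order>";

    // Hypothetical helper: selects Quantity using the code-side prefix "aa".
    public static string ReadQuantity(string xml)
    {
        var xDoc = new XPathDocument(new StringReader(xml));
        XPathNavigator xNav = xDoc.CreateNavigator();

        // "aa" is only an alias used by the XPath below; it is matched
        // against the document by namespace URI, not by prefix.
        var xmlns = new XmlNamespaceManager(xNav.NameTable);
        xmlns.AddNamespace("aa", "urn:namespace1");

        XPathNavigator quantity = xNav.SelectSingleNode("/Order/aa:Quantity", xmlns);
        return quantity?.Value; // null if nothing was selected
    }

    public static void Main()
    {
        Console.WriteLine(ReadQuantity(DocWithAaBb));
        Console.WriteLine(ReadQuantity(DocWithCcDd));
    }
}
```

Both calls should select the Quantity element, even though the second document never declares the prefix aa.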

To be able to include namespace prefixes in the XPath you have to use the overloads that accept an IXmlNamespaceResolver.

To reiterate: when you register a namespace using the following code

xmlns.AddNamespace("aa", "urn:namespace1");

you state that your code (e.g. the XPath you intend to use) will use the namespace prefix aa for the namespace urn:namespace1.

In the XML you want to parse you assign namespaces using an attribute:

xmlns:cc="urn:namespace1"

It is important that the string urn:namespace1 matches in both places for them to refer to the same namespace. The prefixes are local to your code and to the XML file respectively, and they do not have to match.

Martin Liversage
  • In your case the XPath queries need to be aware of the XML namespace alias, e.g. /Order/aa:Quantity or /Order/cc:Quantity. I don't consider that XML namespace agnostic. – Lybecker Oct 15 '12 at 10:08
  • @Lybecker: You assigned the alias `aa` to `urn:namespace1` in the code. That is your decision, and you can use any prefix in your code as long as it is unique. You could use `zz` if you wanted to, and whether the XML you are processing uses `aa`, `cc` or even the default empty prefix for `urn:namespace1`, your XPath would still select what you expected. That is the beauty of namespace prefixes. They are local aliases for the real namespace. – Martin Liversage Oct 15 '12 at 10:11

The namespace aliases used in the document don't matter - they are just the aliases that are used in the document and can be whatever the author of that document wanted to use when authoring that document (they can even change mid-document).

To access this document in an alias-agnostic way, just provide whatever aliases you want to use to the XmlNamespaceManager and then use those aliases to access the document, for example:

XmlNamespaceManager xmlns = new XmlNamespaceManager(xNav.NameTable);
xmlns.AddNamespace("foo", "urn:namespace1");
xmlns.AddNamespace("bar", "urn:namespace2");

These aliases don't need to match the ones used in the document. You can then use XPath expressions with the foo and bar aliases for those namespaces to navigate the document regardless of the aliases used in the document itself (as long as you supply that instance of XmlNamespaceManager).
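A small sketch of how that might look in practice, assuming a made-up sample document in which urn:namespace1 is declared as a default namespace on the element itself and urn:namespace2 is bound to a local prefix p. The names AliasAgnosticDemo, CreateResolver and Select are hypothetical helpers, not part of any library.

```csharp
using System;
using System.IO;
using System.Xml;
using System.Xml.XPath;

public static class AliasAgnosticDemo
{
    // Hypothetical document: no "foo" or "bar" prefix appears anywhere in it.
    public const string Sample = @"<Order>
  <Quantity xmlns=""urn:namespace1"">1</Quantity>
  <p:Price xmlns:p=""urn:namespace2"">9.98</p:Price>
</Order>";

    // Registers the code-side aliases "foo" and "bar" for the two namespaces.
    public static XmlNamespaceManager CreateResolver(XmlNameTable nameTable)
    {
        var xmlns = new XmlNamespaceManager(nameTable);
        xmlns.AddNamespace("foo", "urn:namespace1");
        xmlns.AddNamespace("bar", "urn:namespace2");
        return xmlns;
    }

    // Evaluates an XPath that uses the foo/bar aliases against the given XML.
    public static string Select(string xml, string xpath)
    {
        XPathNavigator xNav = new XPathDocument(new StringReader(xml)).CreateNavigator();
        XmlNamespaceManager xmlns = CreateResolver(xNav.NameTable);
        return xNav.SelectSingleNode(xpath, xmlns)?.Value; // null if no match
    }

    public static void Main()
    {
        Console.WriteLine(Select(Sample, "/Order/foo:Quantity"));
        Console.WriteLine(Select(Sample, "/Order/bar:Price"));
    }
}
```

The resolver is built once from fixed namespace URIs, so nothing in the code needs to inspect which prefixes a given document happens to use.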

Justin