I am parsing this XML document:

<?xml version='1.0' encoding='UTF-8'?>
<team xmlns='http://www.example.com/default' xmlns:ns1='http://www.example.com/ns1'>
    <ns1:coach ns1:coachAttr="ABC"/>
    <player playerAttr="XYZ"/>
</team>

I would expect player and playerAttr to be in the http://www.example.com/default namespace, while coach and coachAttr would be in the http://www.example.com/ns1 namespace.

Turns out playerAttr has no namespace at all. Here is the code:

String xml="...";
DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
factory.setNamespaceAware(true);
DocumentBuilder builder = factory.newDocumentBuilder();
Document doc = builder.parse(new InputSource(new StringReader(xml)));

Element team = doc.getDocumentElement();
Element player = (Element) team.getChildNodes().item(...);
Element coach = (Element) team.getChildNodes().item(...);
Attr playerAttr = (Attr) player.getAttributes().item(...);
Attr coachAttr = (Attr) coach.getAttributes().item(...);

System.out.println("coach: Name=" + coach.getLocalName() + " NS=" + coach.getNamespaceURI());
System.out.println("coachAttr: Name=" + coachAttr.getLocalName() + " NS=" + coachAttr.getNamespaceURI());
System.out.println("player: Name=" + player.getLocalName() + " NS=" + player.getNamespaceURI());
System.out.println("playerAttr: Name=" + playerAttr.getLocalName() + " NS=" + playerAttr.getNamespaceURI());

This prints out 4 lines. The first 3 make sense to me. I do not understand the last line, where NS is null.

coach: Name=coach NS=http://www.example.com/ns1
coachAttr: Name=coachAttr NS=http://www.example.com/ns1
player: Name=player NS=http://www.example.com/default
playerAttr: Name=playerAttr NS=null

Why is playerAttr treated differently? Is this in some kind of spec? What does it even mean that an item has no namespace?

George

2 Answers

Every XML parser behaves like this. Since your question is 'why', I would guess that this is simply an easier approach for most practical purposes.

More details here:

https://stackoverflow.com/a/3313538/80911

Evert
  • This doesn't make it easier to write robust code. In the above example I am getting child nodes by index for illustrative purposes. In real life I would use getAttributeNS(nsName, attrName). There is no way to know in advance that the XML string will use a namespace prefix for coachAttr and no namespace prefix for playerAttr. Namespace prefixes are not supposed to change the meaning of XML. – George Jun 08 '17 at 20:39
  • The easy answer is to not use getAttributeNS, but getAttribute. Assume that every non-namespaced attribute is really part of the namespace of its parent element. – Evert Jun 08 '17 at 20:42
  • Do you mean that in the attribute abc belongs to namespace ns1, not to the default namespace? Is it in the spec? – George Jun 08 '17 at 20:47
  • See the link I posted, it does a better job explaining it than I do. – Evert Jun 08 '17 at 21:45
  • Thank you. The linked question has a good explanation. An unprefixed attribute does not belong to any namespace. This is different from an unprefixed element name, which belongs to the default namespace. The linked question has the relevant quote from the spec. – George Jun 10 '17 at 21:25
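
To make the getAttributeNS versus getAttribute discussion in these comments concrete, here is a minimal sketch against the question's document. The class name AttrLookupSketch and the helpers parse and firstElement are made up for illustration, and it assumes the DOM parser that ships with the JDK. The point it tries to show: an unprefixed attribute can be looked up robustly with a null namespace, whereas getAttribute matches on the qualified name and so depends on the prefix the document happens to use.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.xml.sax.InputSource;

public class AttrLookupSketch {
    static final String DEFAULT_NS = "http://www.example.com/default";
    static final String NS1 = "http://www.example.com/ns1";

    // Same document as in the question, with whitespace between elements dropped.
    static final String XML =
        "<?xml version='1.0' encoding='UTF-8'?>"
        + "<team xmlns='http://www.example.com/default' xmlns:ns1='http://www.example.com/ns1'>"
        + "<ns1:coach ns1:coachAttr=\"ABC\"/>"
        + "<player playerAttr=\"XYZ\"/>"
        + "</team>";

    // Hypothetical helper: namespace-aware parse of an XML string.
    static Document parse(String xml) throws Exception {
        DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
        factory.setNamespaceAware(true);
        DocumentBuilder builder = factory.newDocumentBuilder();
        return builder.parse(new InputSource(new StringReader(xml)));
    }

    // Hypothetical helper: first element with the given namespace and local name.
    static Element firstElement(Document doc, String ns, String localName) {
        return (Element) doc.getElementsByTagNameNS(ns, localName).item(0);
    }

    public static void main(String[] args) throws Exception {
        Document doc = parse(XML);
        Element player = firstElement(doc, DEFAULT_NS, "player");
        Element coach = firstElement(doc, NS1, "coach");

        // Unprefixed attribute: found with a null namespace...
        System.out.println(player.getAttributeNS(null, "playerAttr"));       // XYZ
        // ...but not in the default namespace (missing attributes yield "").
        System.out.println(player.getAttributeNS(DEFAULT_NS, "playerAttr")); // (empty line)

        // Prefixed attribute: found by namespace URI and local name.
        System.out.println(coach.getAttributeNS(NS1, "coachAttr"));          // ABC
        // getAttribute matches the qualified name, so the prefix matters.
        System.out.println(coach.getAttribute("ns1:coachAttr"));             // ABC
        System.out.println(coach.getAttribute("coachAttr"));                 // (empty line)
    }
}
```

If the document author had bound a different prefix to http://www.example.com/ns1, the getAttributeNS calls would keep working while getAttribute("ns1:coachAttr") would not, which is why getAttributeNS with null for unprefixed attributes seems the safer habit.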

What exactly do you mean by "why"? Do you mean "where is this behaviour specified?", or "what rationale did the designers give for this decision?", or "what advantages are there in doing it this way?"

(None of these questions are a good fit for StackOverflow, by the way...)

In fact, there isn't universal consensus on this. DOM and XPath and most other APIs do it this way, but it's not mandated by the XML Namespaces spec itself, which says:

Default namespace declarations do not apply directly to attribute names; the interpretation of unprefixed attributes is determined by the element on which they appear.

I've heard lots of theories about what that was supposed to mean, but none of the theories translate particularly well into concrete APIs.

In practice you need to look at the specs for the API you are using, which in this case is DOM. (99% of the Java/XML users on StackOverflow appear to use DOM, which I find very depressing, because there are much better alternatives available, such as JDOM2 and XOM.)
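
As an illustration of XPath treating names the same way as DOM, here is a rough sketch using the JDK's javax.xml.xpath API on the question's document. The class name XPathAttrSketch, the eval helper, and the prefix "d" bound to the default namespace are all assumptions for the example; XPath 1.0 has no default namespace for name tests, so the elements need a prefix in the expression while the unprefixed attribute does not.

```java
import java.io.StringReader;
import java.util.Collections;
import java.util.Iterator;
import javax.xml.XMLConstants;
import javax.xml.namespace.NamespaceContext;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

public class XPathAttrSketch {
    // Same document as in the question, with whitespace between elements dropped.
    static final String XML =
        "<?xml version='1.0' encoding='UTF-8'?>"
        + "<team xmlns='http://www.example.com/default' xmlns:ns1='http://www.example.com/ns1'>"
        + "<ns1:coach ns1:coachAttr=\"ABC\"/>"
        + "<player playerAttr=\"XYZ\"/>"
        + "</team>";

    // Minimal prefix mapping for the expressions below; "d" is an arbitrary choice.
    static final NamespaceContext CONTEXT = new NamespaceContext() {
        @Override
        public String getNamespaceURI(String prefix) {
            if ("d".equals(prefix)) return "http://www.example.com/default";
            if ("ns1".equals(prefix)) return "http://www.example.com/ns1";
            return XMLConstants.NULL_NS_URI;
        }

        @Override
        public String getPrefix(String namespaceURI) {
            return null; // reverse lookup is not needed for this sketch
        }

        @Override
        public Iterator<String> getPrefixes(String namespaceURI) {
            return Collections.emptyIterator();
        }
    };

    // Hypothetical helper: evaluate an XPath expression against the document as a string.
    static String eval(String expression) throws Exception {
        DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
        factory.setNamespaceAware(true);
        Document doc = factory.newDocumentBuilder()
                .parse(new InputSource(new StringReader(XML)));
        XPath xpath = XPathFactory.newInstance().newXPath();
        xpath.setNamespaceContext(CONTEXT);
        return xpath.evaluate(expression, doc);
    }

    public static void main(String[] args) throws Exception {
        // Unprefixed attribute name test selects the no-namespace attribute.
        System.out.println(eval("/d:team/d:player/@playerAttr"));     // XYZ
        // Asking for it in the default namespace selects nothing.
        System.out.println(eval("/d:team/d:player/@d:playerAttr"));   // (empty line)
        // The prefixed attribute is selected by its namespace.
        System.out.println(eval("/d:team/ns1:coach/@ns1:coachAttr")); // ABC
    }
}
```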

Michael Kay