I believe this question was attempted in 2006 on a different site. My current XML/RDF writer (XML::LibXML 1.70) outputs element namespaces as xmlns attributes rather than as prefixes. This will exclude people using non-namespace-aware parsers who just do a look_down for foaf:Person. Does anyone know an easy way in Perl to get prefixed element names instead, preferably with XML::LibXML, or failing that by some other means?

Nodes like this:

  <Person xmlns="http://xmlns.com/foaf/0.1/" rdf:ID="me"/>

And, this:

  <name xmlns="http://xmlns.com/foaf/0.1/">Evan Carroll</name>

Should really look like:

  <foaf:Person rdf:ID="me"/>
  <foaf:name>Evan Carroll</foaf:name>

Any ideas? I believe it is technically correct either way, but I'd much rather not depend on other people knowing this. I didn't know it myself yesterday.

Evan Carroll
  • I'm not even sure this is correct. I'm totally confused here. – Evan Carroll Mar 01 '10 at 23:48
  • I'm with Evan. You're wrong - you're trying to turn a valid FOAF document into something that isn't. People using non-namespace aware "XML parsers" is not your problem, and if it is, you shouldn't try to fix it by also breaking things at your end. – reinierpost Mar 02 '10 at 17:58
  • You're 'with' the Evan who asked the question or the Evan who made the first comment? I'm not sure who you think is 'wrong', but there are (sometimes) sensible reasons for tight control over namespace declarations. This can be achieved with XML::LibXML's DOM bindings. As far as I can see, nothing is 'broken' by exercising this control. – Andrew Walker Mar 02 '10 at 18:24
  • @reinierpost, this was my confusion: (a) I don't want to have to depend on XML-NS aware foaf bots. (b) I didn't know or visually like the redundant xmlns attributes, and (c) I didn't know that both `foaf:` and `xmlns=` could be excluded and still have a working foaf document. – Evan Carroll Mar 02 '10 at 19:24
  • I'm with commenter Evan, in that you should never change the namespaces on the elements, but your answer does exactly what questioner Evan wants. – reinierpost Mar 03 '10 at 09:48
  • There is no commenter Evan that was me, the questioner, stating that I was confused because I wasn't sure that `foo` was the same thing as `foo` and moreover, the same as `foo` – Evan Carroll Mar 03 '10 at 16:51

1 Answer

The short answer is that if you already have a namespaceURI and prefix declared you can specify the qualified name (i.e. prefix:localName) as an element name and this will make XML::LibXML avoid redeclaring the namespace. So, modifying the code from the last question gives the following, which does use the desired namespace prefixes:

#!/usr/bin/perl
use warnings;
use strict;
use XML::LibXML;

my $doc = XML::LibXML::Document->new( '1.0', 'UTF-8' );

# Root element (rdf:RDF). Every namespace/prefix pair is declared here, once.
my $foaf = $doc->createElementNS( 'http://www.w3.org/1999/02/22-rdf-syntax-ns#', 'RDF' );
$doc->setDocumentElement( $foaf );
# Third argument: 1 also applies the namespace to the element, 0 only declares it.
$foaf->setNamespace( 'http://www.w3.org/1999/02/22-rdf-syntax-ns#' , 'rdf', 1 );
$foaf->setNamespace( 'http://www.w3.org/2000/01/rdf-schema#' , 'rdfs', 0 );
$foaf->setNamespace( 'http://xmlns.com/foaf/0.1/' , 'foaf', 0 );
$foaf->setNamespace( 'http://webns.net/mvcb/' , 'admin', 0 );

# Passing the qualified name (prefix:localName) reuses the declaration on the root.
my $node = $doc->createElementNS( 'http://xmlns.com/foaf/0.1/', 'foaf:Person');
$foaf->appendChild($node);
$node->setAttributeNS( 'http://www.w3.org/1999/02/22-rdf-syntax-ns#', 'ID', 'me');

my $node2 = $doc->createElementNS( 'http://xmlns.com/foaf/0.1/', 'foaf:name');
$node2->appendTextNode('Evan Carroll');
$node->appendChild($node2);

print $doc->toString;

It is worth reviewing what is going on, though. XML Namespaces exist to allow multiple vocabularies to be used together in the same XML document. To achieve this, the concept of a namespace URI (nsURI) is introduced, and a mechanism for indicating which nsURI applies to which elements and attributes is retrofitted onto XML. The mechanism relies on the fact that names starting with 'xml' are reserved, so a special attribute name (xmlns) can be used without risk of collision.

The general idea is that it is possible to link each vocabulary used in an XML document with a unique nsURI (which is treated as an opaque string). The head element in the XHTML vocabulary is fully defined by {'http://www.w3.org/1999/xhtml':'head'}, and this is clearly different from the head in a (hypothetical) anatomy-ML {'my-made-up-URI':'head'}. The issue is how to embed the nsURI(s) in an XML document and how to link these to the element names.

One way to make the link between a nsURI and an element name is to add the xmlns attribute to the element. For example:

<name xmlns="http://xmlns.com/foaf/0.1/">Evan Carroll</name>

says that 'name' is in the 'http://xmlns.com/foaf/0.1/' namespace. Namespace declarations are inherited by children, so 'age' is in the same namespace:

<name xmlns="http://xmlns.com/foaf/0.1/">Evan Carroll<age years='21'/></name>
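The inheritance can be checked directly. The following is a minimal sketch, assuming XML::LibXML 1.70 or later (for load_xml); it parses the fragment above and prints the namespace URI that each element reports. The variable names are only illustrative.

```perl
use strict;
use warnings;
use XML::LibXML;

# The fragment from the text: only 'name' carries an xmlns attribute.
my $xml = q{<name xmlns="http://xmlns.com/foaf/0.1/">Evan Carroll<age years='21'/></name>};
my $doc = XML::LibXML->load_xml( string => $xml );

my $name  = $doc->documentElement;
my ($age) = $name->getChildrenByTagName('age');

# Both elements report the same namespace URI, although 'age'
# has no namespace declaration of its own.
printf "%s -> %s\n", $_->nodeName, $_->namespaceURI for $name, $age;
```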

This can work well and be quite compact. However, it doesn't work for attributes and can get messy if lots of sibling nodes need to change namespace from their common parent. To deal with both of these problems the namespace prefix (nsPrefix) is introduced, which gives the colon a special meaning. The idea is to link the nsURI to a short string that is used within the current document. This string has no special meaning outside the document and shouldn't be specified by the vocabulary (but it sometimes is; a discussion for elsewhere). It's particularly common for all nsURIs to be declared on the root element. The syntax is to declare the namespace thus:

xmlns:prefix="http://xmlns.com/foaf/0.1/"

and use it in attribute and element names by prepending the nsPrefix to the name:

<prefix:name prefix:attribute='value'/>
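To see that the prefix is only a document-local label, here is a small sketch (again assuming XML::LibXML 1.70 or later) that parses the same element written three ways and prints the {nsURI}localName pair for each. The prefixes x and p and the attribute name plain are made up for the illustration. The second half shows that a prefixed attribute is in the namespace while an unprefixed one is in none.

```perl
use strict;
use warnings;
use XML::LibXML;

my $ns = 'http://xmlns.com/foaf/0.1/';

# Same element, three spellings: default namespace and two different prefixes.
my @variants = (
    qq{<name xmlns="$ns">Evan Carroll</name>},
    qq{<foaf:name xmlns:foaf="$ns">Evan Carroll</foaf:name>},
    qq{<x:name xmlns:x="$ns">Evan Carroll</x:name>},
);

my @expanded;
for my $xml (@variants) {
    my $el = XML::LibXML->load_xml( string => $xml )->documentElement;
    push @expanded, sprintf '{%s}%s', $el->namespaceURI, $el->localname;
    printf "%-10s %s\n", $el->nodeName, $expanded[-1];
}

# Attributes: only the prefixed one belongs to the namespace.
my $el = XML::LibXML->load_xml(
    string => qq{<p:name xmlns:p="$ns" p:attribute="value" plain="value"/>}
)->documentElement;
my $prefixed = $el->getAttributeNodeNS( $ns, 'attribute' );
my $plain    = $el->getAttributeNode('plain');
print "p:attribute -> ", $prefixed->namespaceURI, "\n";
print "plain       -> ", ( defined $plain->namespaceURI ? $plain->namespaceURI : '(no namespace)' ), "\n";
```

A namespace-aware consumer compares the {nsURI}localName pair, so all three spellings are the same element to it.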

Because the exact values of nsPrefixes are not supposed to matter, APIs generally don't make accessing or setting them very easy (XPath is a good example). Namespaces also add constraints whose violation should be treated as an error; using a prefix that isn't declared is one example. Such a document can still be well-formed according to the XML specification (remember, namespaces are retrofitted). You can describe it as 'not namespace-well-formed'.
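As an illustration of the XPath point, the sketch below uses XML::LibXML::XPathContext, which binds its own prefix to the nsURI for the duration of the query. The prefix f is arbitrary and chosen here; the document itself never uses it. An unprefixed name in an XPath 1.0 expression means 'no namespace', so the plain //name query finds nothing.

```perl
use strict;
use warnings;
use XML::LibXML;
use XML::LibXML::XPathContext;

# The FOAF elements use a default namespace; no 'foaf:' prefix appears.
my $doc = XML::LibXML->load_xml( string => <<'XML' );
<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#">
  <Person xmlns="http://xmlns.com/foaf/0.1/" rdf:ID="me">
    <name>Evan Carroll</name>
  </Person>
</rdf:RDF>
XML

# Bind a query-side prefix to the FOAF nsURI.
my $xpc = XML::LibXML::XPathContext->new($doc);
$xpc->registerNs( f => 'http://xmlns.com/foaf/0.1/' );

my $name  = $xpc->findvalue('//f:Person/f:name');
my @plain = $doc->findnodes('//name');    # unprefixed: matches no-namespace elements only

print "with registered prefix: '$name'\n";
print "unprefixed //name matched ", scalar(@plain), " node(s)\n";
```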

Parsing a document that uses namespaces with a parser that doesn't know anything about namespaces is obviously easier if you know the namespace prefixes in advance. But this is quite a brittle solution, as namespace prefixes can change in odd places as an XML document is repeatedly processed. Most parsers are namespace-aware.
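The brittleness is easy to reproduce. This sketch (assuming XML::LibXML 1.70 or later) takes two equivalent documents and compares a match on the literal qualified name 'foaf:name', roughly what a non-namespace-aware look_down would do, with a namespace-aware lookup. The hash keys and variable names are made up for the illustration.

```perl
use strict;
use warnings;
use XML::LibXML;

my $ns = 'http://xmlns.com/foaf/0.1/';

# Two serialisations of the same data: one prefixed, one using a default namespace.
my %docs = (
    prefixed => qq{<foaf:Person xmlns:foaf="$ns"><foaf:name>Evan Carroll</foaf:name></foaf:Person>},
    default  => qq{<Person xmlns="$ns"><name>Evan Carroll</name></Person>},
);

my %found;
for my $label ( sort keys %docs ) {
    my $doc = XML::LibXML->load_xml( string => $docs{$label} );

    # Match on the literal string 'foaf:name', ignoring namespaces.
    my @by_qname = grep { $_->nodeName eq 'foaf:name' } $doc->findnodes('//*');

    # Match on (nsURI, local name), whatever prefix the document uses.
    my @by_ns = $doc->getElementsByTagNameNS( $ns, 'name' );

    $found{$label} = [ scalar @by_qname, scalar @by_ns ];
    printf "%-8s qname matches: %d, namespace matches: %d\n", $label, @{ $found{$label} };
}
```

The string match only works for the serialisation it was written against, while the namespace-aware lookup finds the element in both.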

Andrew Walker