13

I want to use JDOM to read in an XML file, then use XPath to extract data from the JDOM Document. It creates the Document object fine, but when I use XPath to query the Document for a List of elements, I get nothing.

My XML document has a default namespace defined in the root element. The funny thing is, when I remove the default namespace, it successfully runs the XPath query and returns the elements I want. What else must I do to get my XPath query to return results?

XML:

<?xml version="1.0" encoding="UTF-8"?>
<collection xmlns="http://www.foo.com">
<dvd id="A">
  <title>Lord of the Rings: The Fellowship of the Ring</title>
  <length>178</length>
  <actor>Ian Holm</actor>
  <actor>Elijah Wood</actor>
  <actor>Ian McKellen</actor>
</dvd>
<dvd id="B">
  <title>The Matrix</title>
  <length>136</length>
  <actor>Keanu Reeves</actor>
  <actor>Laurence Fishburne</actor>
</dvd>
</collection>

Java:

public static void main(String args[]) throws Exception {
    SAXBuilder builder = new SAXBuilder();
    Document d = builder.build("xpath.xml");
    XPath xpath = XPath.newInstance("collection/dvd");
    xpath.addNamespace(d.getRootElement().getNamespace());
    System.out.println(xpath.selectNodes(d));
}
Michael

3 Answers

26

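XPath 1.0 has no concept of a default namespace (XPath 2.0 does). An unprefixed name in an XPath 1.0 expression always refers to an element in no namespace, so collection/dvd cannot match elements that live in http://www.foo.com.

With XPath 1.0 you have to bind the namespace URI to a prefix of your choosing and use that prefix in every step of the expression, something like this: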

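// Requires org.jdom.Document, org.jdom.input.SAXBuilder and org.jdom.xpath.XPath (JDOM 1.x).
public static void main(String[] args) throws Exception {
    SAXBuilder builder = new SAXBuilder();
    Document d = builder.build("xpath.xml");
    // Every step uses the prefix "x"; the prefix itself is arbitrary.
    XPath xpath = XPath.newInstance("x:collection/x:dvd");
    // Bind "x" to the document's default namespace URI.
    xpath.addNamespace("x", d.getRootElement().getNamespaceURI());
    System.out.println(xpath.selectNodes(d));
}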
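
For anyone who wants to see the rule without JDOM on the classpath, here is a minimal sketch that swaps in the JDK's built-in javax.xml.xpath API (not the JDOM XPath class used above). The class name DefaultNamespaceXPathDemo and the helper countMatches are made up for illustration, and the XML is a trimmed copy of the document from the question.

```java
import java.io.StringReader;
import java.util.Collections;
import java.util.Iterator;
import javax.xml.XMLConstants;
import javax.xml.namespace.NamespaceContext;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

public class DefaultNamespaceXPathDemo {

    // Trimmed version of the question's document; the root declares a default namespace.
    static final String XML =
          "<collection xmlns='http://www.foo.com'>"
        + "<dvd id='A'><title>Lord of the Rings: The Fellowship of the Ring</title></dvd>"
        + "<dvd id='B'><title>The Matrix</title></dvd>"
        + "</collection>";

    /** Counts the nodes that expr selects; the prefix "x" is bound to http://www.foo.com. */
    static int countMatches(String expr) {
        try {
            DocumentBuilderFactory dbf = DocumentBuilderFactory.newInstance();
            dbf.setNamespaceAware(true); // without this the parser ignores namespaces
            Document doc = dbf.newDocumentBuilder()
                              .parse(new InputSource(new StringReader(XML)));

            XPath xpath = XPathFactory.newInstance().newXPath();
            xpath.setNamespaceContext(new NamespaceContext() {
                @Override
                public String getNamespaceURI(String prefix) {
                    return "x".equals(prefix) ? "http://www.foo.com" : XMLConstants.NULL_NS_URI;
                }
                @Override
                public String getPrefix(String uri) {
                    return null; // reverse lookup is not needed for this sketch
                }
                @Override
                public Iterator<String> getPrefixes(String uri) {
                    return Collections.emptyIterator();
                }
            });

            NodeList nodes = (NodeList) xpath.evaluate(expr, doc, XPathConstants.NODESET);
            return nodes.getLength();
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        // Unprefixed names mean "no namespace" in XPath 1.0, so nothing matches.
        System.out.println("unprefixed: " + countMatches("collection/dvd"));     // unprefixed: 0
        // With the prefix bound to the document's namespace, both dvd elements match.
        System.out.println("prefixed:   " + countMatches("x:collection/x:dvd")); // prefixed:   2
    }
}
```

The prefix in the expression does not have to match anything in the XML file. Only the namespace URI it is bound to matters.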
Verhagen
AnthonyWJones
7

I had a similar problem: my XML inputs were mixed, some with a namespace defined and others without. To simplify things, I ran the following JDOM snippet after loading the document.

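// JDOM 2 (org.jdom2): walk the whole document so the root element is included too.
for (Element el : doc.getDescendants(new ElementFilter())) {
    // getNamespace() is never null; unqualified elements report Namespace.NO_NAMESPACE.
    el.setNamespace(Namespace.NO_NAMESPACE);
}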

After removing all the namespaces I was able to use simple getChild("elname") style navigation or simple XPath queries.

I wouldn't recommend this technique as a general solution, but in my case it was definitely useful.

Peter Mortensen
Michael Rutherfurd
  • Thanks for the suggestion. I thought about doing something like this, but like you sort of alluded to, removing the namespaces means that there's the chance that you'll run into name collisions, depending on what your XML data looks like. – Michael Feb 13 '09 at 19:24
1

You can also match on local names, which ignores namespaces entirely:

/*[local-name() = 'collection']/*[local-name() = 'dvd']

Here is a list of useful XPath queries.
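
A small sketch of the local-name() approach, again using the JDK's javax.xml.xpath rather than JDOM so it runs without extra jars. The class LocalNameXPathDemo, the helper countMatches, and the second namespace http://www.bar.com are invented for illustration. The second document shows the name-collision risk: local-name() matches a dvd element from any namespace unless you also test namespace-uri().

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

public class LocalNameXPathDemo {

    // Same shape as the question's document: everything is in the default namespace.
    static final String SINGLE_NS =
          "<collection xmlns='http://www.foo.com'>"
        + "<dvd id='A'/><dvd id='B'/>"
        + "</collection>";

    // Hypothetical document in which a second namespace also defines a dvd element.
    static final String MIXED_NS =
          "<collection xmlns='http://www.foo.com' xmlns:b='http://www.bar.com'>"
        + "<dvd id='A'/><b:dvd id='Z'/>"
        + "</collection>";

    /** Counts the nodes that expr selects in xml; no prefix bindings are registered. */
    static int countMatches(String xml, String expr) {
        try {
            DocumentBuilderFactory dbf = DocumentBuilderFactory.newInstance();
            dbf.setNamespaceAware(true); // needed so local-name() and namespace-uri() behave
            Document doc = dbf.newDocumentBuilder()
                              .parse(new InputSource(new StringReader(xml)));
            XPath xpath = XPathFactory.newInstance().newXPath();
            NodeList nodes = (NodeList) xpath.evaluate(expr, doc, XPathConstants.NODESET);
            return nodes.getLength();
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        String byLocalName = "/*[local-name() = 'collection']/*[local-name() = 'dvd']";
        String byLocalNameAndUri =
            "/*[local-name() = 'collection']"
          + "/*[local-name() = 'dvd' and namespace-uri() = 'http://www.foo.com']";

        System.out.println(countMatches(SINGLE_NS, byLocalName));      // 2
        System.out.println(countMatches(MIXED_NS, byLocalName));       // 2 (matches b:dvd too)
        System.out.println(countMatches(MIXED_NS, byLocalNameAndUri)); // 1
    }
}
```

If the documents can mix vocabularies, adding the namespace-uri() test keeps the convenience of not registering a prefix while avoiding accidental matches.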

Jerome Anthony