
I have some XML that roughly looks like this:

<project type="mankind">
    <suggestion>Build the enterprise</suggestion>
    <suggestion>Learn Esperanto</suggestion>
    <problem>Solve world hunger</suggestion>
    <discussion>Do Vulcans exist</discussion>
</project>

I want to use XPath from Java to find the names of the second-level elements (there can be elements I won't know up front). This is the code I tried:

public NodeList xpath2NodeList(Document doc, String xPathString) throws XPathExpressionException {
     XPath xpath = XPathFactory.newInstance().newXPath();
     MagicNamespaceContext nsc = new MagicNamespaceContext();
     xpath.setNamespaceContext(nsc);
     Object exprResult = xpath.evaluate(xPathString, doc, XPathConstants.NODESET);
     return (NodeList) exprResult;
}

My XPath is /project/*/name(). I get the error:

javax.xml.transform.TransformerException: Unknown nodetype: name

A query like /project/suggestion works as expected. What am I missing? I'd like to get a list with the tag names.

Mykola Yashchenko
stwissel

2 Answers


Java6 (don't ask).

I think your implementation only supports XPath 1.0. If that were true, only the following would work:

"name(/project/*)"

The reason for this is that in the XPath 1.0 model, you cannot use a function (like name()) as a step in a path expression. Your code throws an exception because the processor mistakes name() for an unknown node test (like comment()). There is nothing wrong, however, with using a path expression as the argument of the name() function.

Unfortunately, if an XPath 1.0 function that can only handle a single node as an argument is given a node-set, only the first node in document order is used. Therefore, you will most likely get only the first element name as a result.
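
To make that limitation concrete, here is a minimal sketch that evaluates name(/project/*) with a string return type. The class name FirstNameOnly and the parse helper are made up for illustration, and the XML is a shortened, well-formed variant of the document from the question.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

class FirstNameOnly {

    // Hypothetical helper: parses an XML string into a DOM document.
    static Document parse(String xml) throws Exception {
        return DocumentBuilderFactory.newInstance().newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));
    }

    // Evaluates name(/project/*) and requests a string result.
    // Under XPath 1.0 rules, name() uses only the first node of the node-set.
    static String firstChildName(Document doc) throws Exception {
        XPath xpath = XPathFactory.newInstance().newXPath();
        return (String) xpath.evaluate("name(/project/*)", doc, XPathConstants.STRING);
    }

    public static void main(String[] args) throws Exception {
        String xml = "<project type=\"mankind\">"
                + "<suggestion>Build the enterprise</suggestion>"
                + "<problem>Solve world hunger</problem>"
                + "</project>";
        System.out.println(firstChildName(parse(xml))); // prints "suggestion"
    }
}
```

Note that the return type has to be XPathConstants.STRING here, because name() yields a string rather than a node-set.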

XPath 1.0's capability to manipulate results is very limited, and often the easiest way around such problems is to reach for the higher-level language that uses XPath as its query language (in your case, Java). Put another way: write an XPath expression that retrieves all relevant nodes, then iterate over the result in Java and return the element names.
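
A rough sketch of that approach, assuming the expression always selects element nodes. The class name ElementNames and the helpers parse and namesOf are hypothetical, not part of any library.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

class ElementNames {

    // Hypothetical helper: parses an XML string into a DOM document.
    static Document parse(String xml) throws Exception {
        return DocumentBuilderFactory.newInstance().newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));
    }

    // Lets XPath select the nodes, then collects their names in Java.
    static List<String> namesOf(Document doc, String expression) throws Exception {
        XPath xpath = XPathFactory.newInstance().newXPath();
        NodeList nodes = (NodeList) xpath.evaluate(expression, doc, XPathConstants.NODESET);
        List<String> names = new ArrayList<String>();
        for (int i = 0; i < nodes.getLength(); i++) {
            names.add(nodes.item(i).getNodeName());
        }
        return names;
    }

    public static void main(String[] args) throws Exception {
        String xml = "<project type=\"mankind\">"
                + "<suggestion>Build the enterprise</suggestion>"
                + "<suggestion>Learn Esperanto</suggestion>"
                + "<problem>Solve world hunger</problem>"
                + "<discussion>Do Vulcans exist</discussion>"
                + "</project>";
        // prints [suggestion, suggestion, problem, discussion]
        System.out.println(namesOf(parse(xml), "/project/*"));
    }
}
```

Duplicate names are kept on purpose; wrapping the result in a java.util.LinkedHashSet would be one way to get distinct names in document order.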

With XPath 2.0, your initial query would be fine. Also see this related question.

Mathias Müller
  • Much better, now I get an org.apache.xpath.XPathException: Can not convert #STRING to a NodeList!, but I guess that's due to my XPathConstants.NODESET. When I try it with XPathConstants.STRING it works. Now, is there a way to figure out whether a String or a nodeset is coming back? If I omit the parameter it is always a String coming back – stwissel Aug 29 '14 at 15:20
  • @stwissel I'm not familiar with the Java Transformer, but is it necessary that the method that evaluates your path expression returns a `NodeList`? Can you write different methods, one for finding nodes, and the other for strings? – Mathias Müller Aug 29 '14 at 15:28
  • That's the plan B. Of course I then must guess which of the XPath expressions (I read them from a config file) would return a String and which one a nodeset (or I just try and catch the error). – stwissel Aug 29 '14 at 15:32
  • @stwissel As I said, as much as I would like to help with Java, that's not exactly my field of expertise. I find it cumbersome to have to determine the return type in advance and I'm sure there's a better way. How about having just one method to evaluate XPath, only writing expressions that evaluate to a node set, return the nodes and pass them to another method that takes a node set of elements as arguments and returns their names? – Mathias Müller Aug 29 '14 at 15:44
  • I won't write the XPath, so I have no control. I expect 90% to be Nodesets, so I'll just put an error handler around it and if it fails try again using text. Not sexy, but the XML isn't very big, so we talk millisecond delays. Thx for the pointer with the syntax. Appreciate the swift reply – stwissel Aug 29 '14 at 15:48
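
The fallback described in the last comment could look roughly like the sketch below: try NODESET first and retry as STRING if the conversion fails. The class name FlexibleXPath and its methods are hypothetical. It also assumes that the XPath implementation reports the failed conversion as an XPathExpressionException, which is what the javax.xml.xpath API declares for evaluate; other implementations may behave differently.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathExpressionException;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

class FlexibleXPath {

    // Hypothetical helper: parses an XML string into a DOM document.
    static Document parse(String xml) throws Exception {
        return DocumentBuilderFactory.newInstance().newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));
    }

    // Returns a NodeList when the expression yields a node-set,
    // otherwise the string value of the result.
    static Object evaluate(Document doc, String expression) throws XPathExpressionException {
        XPath xpath = XPathFactory.newInstance().newXPath();
        try {
            return xpath.evaluate(expression, doc, XPathConstants.NODESET);
        } catch (XPathExpressionException e) {
            // Assumption: the failure means "result is not a node-set".
            // A genuinely invalid expression fails again here and propagates.
            return xpath.evaluate(expression, doc, XPathConstants.STRING);
        }
    }
}
```

A caller can then check the result with instanceof NodeList to decide how to process it. The expression is evaluated twice in the fallback case, which should be negligible for small documents.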

The code below may answer your original question.

    package com.example.xpath;

    import java.io.File;
    import java.io.IOException;

    import javax.xml.parsers.DocumentBuilder;
    import javax.xml.parsers.DocumentBuilderFactory;
    import javax.xml.parsers.ParserConfigurationException;
    import javax.xml.xpath.XPath;
    import javax.xml.xpath.XPathConstants;
    import javax.xml.xpath.XPathExpression;
    import javax.xml.xpath.XPathExpressionException;
    import javax.xml.xpath.XPathFactory;

    import org.w3c.dom.Document;
    import org.w3c.dom.Node;
    import org.w3c.dom.NodeList;
    import org.xml.sax.SAXException;

    public class XPathReader {

        // One XPath instance, reused for every expression.
        static XPath xPath = XPathFactory.newInstance().newXPath();

        public static void main(String[] args) {

            try {
                DocumentBuilderFactory builderFactory = DocumentBuilderFactory.newInstance();
                DocumentBuilder builder = builderFactory.newDocumentBuilder();

                // Parse the file directly so that no input stream is left open.
                Document xmlDocument = builder.parse(new File("c:/mankind.xml"));

                // Select every child element of <project>, whatever its name.
                XPathExpression expr = xPath.compile("//project/*");
                NodeList list = (NodeList) expr.evaluate(xmlDocument, XPathConstants.NODESET);
                for (int i = 0; i < list.getLength(); i++) {
                    Node node = list.item(i);
                    // getNodeName() is the tag name, getTextContent() the element text.
                    System.out.println(node.getNodeName() + "=" + node.getTextContent());
                }

            } catch (SAXException e) {
                e.printStackTrace();
            } catch (IOException e) {
                // Also covers a missing file (FileNotFoundException).
                e.printStackTrace();
            } catch (ParserConfigurationException e) {
                e.printStackTrace();
            } catch (XPathExpressionException e) {
                e.printStackTrace();
            }
        }
    }

Assuming the input is corrected (<problem>Solve world hunger</problem>), the code should print:

suggestion=Build the enterprise
suggestion=Learn Esperanto
problem=Solve world hunger
discussion=Do Vulcans exist

  • Thx for the code. Challenge here: the XPath returns the whole element, not just the name. I have the need to be able to distinguish between getting the name or the content of an element based on the XPath. With a Nodeset returned the post processing step needs to do that, but that needs additional 'knowledge' besides the XPath expression – stwissel Sep 03 '14 at 02:40
  • On the println the code shows how to just retrieve the name; i.e. node.getNodeName(). – westcoastmilt Sep 03 '14 at 15:58
  • Sorry for not being clear. I appreciate your help. The XPath expressions will be pulled from an external source. When I read "/project/*" (one slash should be sufficient, since it is on the top level) I get a node set back. When I get it back I would not know if the XPath was meant to retrieve the node name or the node value. So the XPath would be ambiguous. Using Java classes to retrieve the name is nicely shown in your example, but I was looking for the XPath directly returning them. Thx for helping out here – stwissel Sep 03 '14 at 16:08
  • I don't see how NodeList can only send back name or value based on the XPath expression. Either it needs to return a NodeList when the xPathString = "/project/*" or String when the xPathString = "name(/project/*[3])". One option is for the method signature to be 'public List xpath2NodeList(Document doc, String xPathString) throws XPathExpressionException'. Vary the XPathConstants to NODESET or STRING, depending if the xPathString starts with 'name('. When using NODESET place all the retrieved values in a List. When using String only return the element name in List index 0. – westcoastmilt Sep 05 '14 at 02:00
  • That was exactly my problem. The best solution is to have a later XPath engine where /name() works. name(something) returns a concatenated string if something is a nodeset. So some error handling is required, as you nicely pointed out – stwissel Sep 05 '14 at 02:13
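
For illustration, the prefix-based dispatch suggested in the comments above might be sketched as follows. The class name NameOrValue and its methods are hypothetical, and the "starts with name(" test is only the heuristic from the comment, not a general way to detect the result type of an expression.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathExpressionException;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

class NameOrValue {

    // Hypothetical helper: parses an XML string into a DOM document.
    static Document parse(String xml) throws Exception {
        return DocumentBuilderFactory.newInstance().newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));
    }

    // Picks the return type from the expression text (heuristic from the comments).
    static List<String> evaluate(Document doc, String expression)
            throws XPathExpressionException {
        XPath xpath = XPathFactory.newInstance().newXPath();
        List<String> result = new ArrayList<String>();
        if (expression.trim().startsWith("name(")) {
            // String result: a single element name at index 0.
            result.add((String) xpath.evaluate(expression, doc, XPathConstants.STRING));
        } else {
            // Node-set result: the text content of every selected node.
            NodeList nodes = (NodeList) xpath.evaluate(expression, doc, XPathConstants.NODESET);
            for (int i = 0; i < nodes.getLength(); i++) {
                result.add(nodes.item(i).getTextContent());
            }
        }
        return result;
    }
}
```

With the corrected XML from the question, evaluate(doc, "name(/project/*[3])") would return a one-element list containing "problem", while evaluate(doc, "/project/*") would return the four text values.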