I have an XML file like this:

    <Parent>
      <child1 key="">
        <sub children>
      </child1>
      <child2 key="">
        <sub children>
      </child2>
    </Parent>

In this XML file I would like to get all nodes that have a 'key' attribute. What is the best way to do this with a Java XML parser? I tried the StAX parser, but it has to visit every element to check whether it has a 'key' attribute, so it is slow on large files.

Techie
  • According to that post, StAX seems to be the more efficient way to parse an XML file (https://stackoverflow.com/a/373935/6138873) – jeanr Aug 11 '17 at 10:21
  • No need for StAX. Use DOM and XPath: you will get the matching nodes directly. – guillaume girod-vitouchkina Aug 11 '17 at 10:32
  • @jeanr StAX is efficient, but I am looking for a more efficient one for my requirement. – Techie Aug 11 '17 at 10:34
  • How large are your files? DOM can handle 100 MB or more and is very fast. Whatever solution you choose, every element has to be examined to see whether your key is there. Or you can try to scan the file yourself (perhaps faster, but not sure). – guillaume girod-vitouchkina Aug 11 '17 at 10:39
  • @guillaumegirod-vitouchkina DOM is not efficient because it keeps the file content in memory. So I am trying XPath and Dom4j, but I am very new to these parsers and am still trying to find the proper APIs. – Techie Aug 11 '17 at 10:41
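
For comparison with the streaming approach discussed in the comments above, here is a minimal StAX sketch. It is only an illustration, not the asker's actual code: the class name `StaxKeyScan`, the method `namesWithKey`, and the sample XML are hypothetical. It still visits every element, but it holds only the current event in memory rather than the whole tree, which is the trade-off against DOM raised above.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamReader;

public class StaxKeyScan {

    // Hypothetical helper: returns the local names of all elements that
    // carry a 'key' attribute (empty or not), in document order.
    public static List<String> namesWithKey(String xml) throws Exception {
        XMLStreamReader reader = XMLInputFactory.newInstance()
                .createXMLStreamReader(new StringReader(xml));
        List<String> names = new ArrayList<>();
        try {
            while (reader.hasNext()) {
                // getAttributeValue returns null when the attribute is absent;
                // a null namespace argument means the namespace is not checked.
                if (reader.next() == XMLStreamConstants.START_ELEMENT
                        && reader.getAttributeValue(null, "key") != null) {
                    names.add(reader.getLocalName());
                }
            }
        } finally {
            reader.close();
        }
        return names;
    }

    public static void main(String[] args) throws Exception {
        // Sample input made up for this sketch.
        String xml = "<Parent>"
                + "<child1 key=\"\"><sub/></child1>"
                + "<child2 key=\"b\"><sub/></child2>"
                + "<child3><sub/></child3>"
                + "</Parent>";
        System.out.println(namesWithKey(xml)); // prints [child1, child2]
    }
}
```

For a real file, the `StringReader` would be replaced by a stream over the file, so the document is never loaded as a whole.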

1 Answer

XPath for all elements that have a 'key' attribute (empty or not):

    expression = "//*[@key]";

or, for didactic purposes, spelling out both cases, empty (@key='') or not empty (string(@key)):

    expression = "//*[(@key='')or(string(@key))]";

There are many examples of parsing with DOM available online.

Standard code:

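    import java.io.StringReader;
    import javax.xml.parsers.DocumentBuilder;
    import javax.xml.parsers.DocumentBuilderFactory;
    import javax.xml.xpath.XPath;
    import javax.xml.xpath.XPathConstants;
    import javax.xml.xpath.XPathExpression;
    import javax.xml.xpath.XPathFactory;
    import org.w3c.dom.Document;
    import org.w3c.dom.NodeList;
    import org.xml.sax.InputSource;

    // 'xml' is a String holding the document text; parse it into a DOM tree
    DocumentBuilderFactory builderFactory = DocumentBuilderFactory.newInstance();
    DocumentBuilder builder = builderFactory.newDocumentBuilder();
    Document document = builder.parse(new InputSource(new StringReader(xml)));

    // Select every element that has a 'key' attribute, empty or not
    XPath xpath = XPathFactory.newInstance().newXPath();
    String expression = "//*[(@key='')or(string(@key))]";
    XPathExpression expr = xpath.compile(expression);
    NodeList nodes = (NodeList) expr.evaluate(document, XPathConstants.NODESET);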
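
To try the snippet end to end, a self-contained sketch might look like the following. The class name `KeyedElements`, the method `namesWithKey`, and the sample XML are made up for illustration; only the standard JAXP classes are used. It uses the shorter `//*[@key]` expression and collects the tag name of each match.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

public class KeyedElements {

    // Hypothetical helper: returns the tag names of all elements that
    // have a 'key' attribute (empty or not).
    public static List<String> namesWithKey(String xml) throws Exception {
        DocumentBuilder builder = DocumentBuilderFactory.newInstance().newDocumentBuilder();
        Document document = builder.parse(new InputSource(new StringReader(xml)));

        XPath xpath = XPathFactory.newInstance().newXPath();
        NodeList nodes = (NodeList) xpath.evaluate("//*[@key]", document, XPathConstants.NODESET);

        List<String> names = new ArrayList<>();
        for (int i = 0; i < nodes.getLength(); i++) {
            names.add(nodes.item(i).getNodeName());
        }
        return names;
    }

    public static void main(String[] args) throws Exception {
        // Sample input made up for this sketch; child3 has no 'key' attribute.
        String xml = "<Parent>"
                + "<child1 key=\"\"><sub/></child1>"
                + "<child2 key=\"b\"><sub/></child2>"
                + "<child3><sub/></child3>"
                + "</Parent>";
        System.out.println(namesWithKey(xml)); // prints [child1, child2]
    }
}
```

Note that this loads the whole document into memory, so it fits the "DOM can handle 100 MB or more" case from the comments rather than arbitrarily large files.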