I have the following regular XML file:

<root>
    <Items>
        <Item>
            <tag1>text1</tag1>
            <tag2>text2</tag2>
            <tag3>text3</tag3>
        </Item>
        <Item>
            <tag1>text1</tag1>
            <tag2>text4</tag2>
            <tag3>text5</tag3>
        </Item>
    </Items>
</root>

I want to get all <Item> nodes whose <tag1> text equals text1, and then print their other tags, for example <tag2>.

I started with this, but I'm struggling to find answers to the TODOs:

import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.*;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;

try {
    DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
    DocumentBuilder builder = factory.newDocumentBuilder();
    // Placeholder path; backslashes must be escaped inside a Java string literal.
    Document doc = builder.parse("\\URI\\file.xml");
    XPathFactory xPathfactory = XPathFactory.newInstance();
    XPath xpath = xPathfactory.newXPath();
    //TODO: Is this correct query?
    XPathExpression expr = xpath.compile("//root//Items//Item//tag1[contains(., 'text1')]");
    NodeList nl = (NodeList) expr.evaluate(doc, XPathConstants.NODESET);
    for (int i = 0; i < nl.getLength(); i++) {
        //TODO: How to iterate over all matched <Item> and get their <tag2>, <tag3> etc.?
    }
} catch (Exception e) {
    e.printStackTrace();
}

Thanks,

michael

1 Answer

You can use the $x("path") function in the Chrome Developer Tools console to test XPath expressions. It also works in recent versions of Firefox.

I made an HTML file with your supplied text and opened it in Chrome. In the console, type $x("/some-path") to try expressions out.

  1. Get items:

    //Item
    
  2. Where tag1 text equals text1:

    //Item/tag1[.='text1']
    
  3. Get following sibling that is a tag2:

    //Item/tag1[.='text1']/following-sibling::tag2
    

If you want the Item and not the tag2:

    //Item[ ./tag1[.='text1']/following-sibling::tag2 ]

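
The same expressions can also be checked from Java instead of the browser console. The following is only a minimal sketch: the class name `XPathCheck`, the helper `select`, and the inlined copy of the sample XML are assumptions made for illustration, and it relies on the JDK's built-in `javax.xml.xpath` API.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

public class XPathCheck {
    // The sample document from the question, inlined so the sketch is self-contained.
    static final String XML =
        "<root><Items>"
        + "<Item><tag1>text1</tag1><tag2>text2</tag2><tag3>text3</tag3></Item>"
        + "<Item><tag1>text1</tag1><tag2>text4</tag2><tag3>text5</tag3></Item>"
        + "</Items></root>";

    // Hypothetical helper: parses the sample and returns the nodes matched by the expression.
    static NodeList select(String expression) throws Exception {
        Document doc = DocumentBuilderFactory.newInstance()
            .newDocumentBuilder()
            .parse(new InputSource(new StringReader(XML)));
        XPath xpath = XPathFactory.newInstance().newXPath();
        return (NodeList) xpath.evaluate(expression, doc, XPathConstants.NODESET);
    }

    public static void main(String[] args) throws Exception {
        // Step 3 from the list above: every tag2 that follows a tag1 equal to 'text1'.
        NodeList tag2s = select("//Item/tag1[.='text1']/following-sibling::tag2");
        for (int i = 0; i < tag2s.getLength(); i++) {
            System.out.println(tag2s.item(i).getTextContent());
        }
    }
}
```

For the sample document this prints `text2` and then `text4`, one per line.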
Neil McGuigan
  • But I use Java. Also, don't I need to spell out the whole path, like //root//Items//Item//, to make the search more efficient? Thanks, – michael Nov 04 '15 at 22:08
  • And I don't want the following sibling, because in general I have more tags. I edited the question. – michael Nov 04 '15 at 22:09
  • You can use the same XPath in Java, at least if you are using XPath 2. Worry about performance later. It probably won't make a difference. – Neil McGuigan Nov 04 '15 at 22:09
  • I have 180K `<Item>` nodes – michael Nov 04 '15 at 22:10
  • My code will return the following sibling that is a tag2. – Neil McGuigan Nov 04 '15 at 22:11
  • The problem is how, in Java, to get an object containing all `<Item>` nodes matched by the query and iterate over them to print their other tags. – michael Nov 04 '15 at 22:13
  • @michael, in that case you need to obtain the Item node that contains tag1=text1. Then you can process the children of that node separately. – izce Nov 04 '15 at 22:16
  • @IzCe Thanks, but I don't understand. Can you please answer with code based on my skeleton code? My main problem is the Java syntax needed to get it to work... – michael Nov 04 '15 at 22:37
  • The XPath expression Neil provided lets you find the Item node in the DOM tree. Then you traverse its child nodes to do the actual processing. You may check the following [link](http://stackoverflow.com/questions/5386991/java-most-efficient-method-to-iterate-over-all-elements-in-a-org-w3c-dom-docume) – izce Nov 04 '15 at 22:46
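
The approach described in the comments (select the matching Item nodes, then walk their children) might look roughly like the sketch below. It is not code from the thread: the class name `MatchingItems`, the method `describeMatches`, and the inlined `SAMPLE` string are made up for illustration. It also assumes an exact match on tag1 (`tag1='text1'`) rather than `contains`, and uses the absolute path `/root/Items/Item` in place of `//`; whether that makes a measurable difference on 180K nodes has not been tested here.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

public class MatchingItems {
    // Inlined copy of the question's sample (assumption: the real code would parse a file).
    static final String SAMPLE =
        "<root>\n  <Items>\n"
        + "    <Item><tag1>text1</tag1><tag2>text2</tag2><tag3>text3</tag3></Item>\n"
        + "    <Item><tag1>text1</tag1><tag2>text4</tag2><tag3>text5</tag3></Item>\n"
        + "  </Items>\n</root>";

    // Returns "name=text" for every child of each matching Item, except tag1 itself.
    static List<String> describeMatches(String xml) throws Exception {
        Document doc = DocumentBuilderFactory.newInstance()
            .newDocumentBuilder()
            .parse(new InputSource(new StringReader(xml)));

        // Select the Item elements themselves, not their tag1 children.
        NodeList items = (NodeList) XPathFactory.newInstance().newXPath()
            .evaluate("/root/Items/Item[tag1='text1']", doc, XPathConstants.NODESET);

        List<String> lines = new ArrayList<>();
        for (int i = 0; i < items.getLength(); i++) {
            NodeList children = items.item(i).getChildNodes();
            for (int j = 0; j < children.getLength(); j++) {
                Node child = children.item(j);
                // Skip whitespace text nodes and the tag that was used for matching.
                if (child.getNodeType() != Node.ELEMENT_NODE) continue;
                if (child.getNodeName().equals("tag1")) continue;
                lines.add(child.getNodeName() + "=" + child.getTextContent());
            }
        }
        return lines;
    }

    public static void main(String[] args) throws Exception {
        for (String line : describeMatches(SAMPLE)) {
            System.out.println(line);
        }
    }
}
```

Because the loop visits every element child, it keeps working when an Item has more tags than tag2 and tag3, which is the case the following-sibling expression did not cover.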