I have an XML file like this:

$ cat sample.xml
<Requests>
        <Request>
                <ID>123</ID>
                <Items>
                        <Item>a item</Item>
                        <Item>b item</Item>
                        <Item>c item</Item>
                </Items>
        </Request>
        <Request>
                <ID>456</ID>
                <Items>
                        <Item>d item</Item>
                        <Item>e item</Item>
                </Items>
        </Request>
</Requests>

I simply want to extract the XML of the Request elements that have a certain value in a grandchild Item element. Here is the code:

bash-4.2$ cat xsearch.py
import sys
import xml.etree.ElementTree as ET


if __name__ == '__main__':
        tree = ET.parse(sys.argv[1])
        root = tree.getroot()
        for request in root.findall(".//Item[.='c item']/../.."):
        #for request in root.findall(".//Request[Items/Item = 'c item']"):
                print(request)

I got an "invalid predicate" error:

bash-4.2$ python3 xsearch.py sample.xml
Traceback (most recent call last):
  File "/usr/lib64/python3.6/xml/etree/ElementPath.py", line 263, in iterfind
    selector = _cache[cache_key]
KeyError: (".//Item[.='c item']/../..", None)

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "xsearch.py", line 8, in <module>
    for request in root.findall(".//Item[.='c item']/../.."):
  File "/usr/lib64/python3.6/xml/etree/ElementPath.py", line 304, in findall
    return list(iterfind(elem, path, namespaces))
  File "/usr/lib64/python3.6/xml/etree/ElementPath.py", line 277, in iterfind
    selector.append(ops[token[0]](next, token))
  File "/usr/lib64/python3.6/xml/etree/ElementPath.py", line 233, in prepare_predicate
    raise SyntaxError("invalid predicate")
SyntaxError: invalid predicate

 

Could anyone point out where I went wrong?

techie11

1 Answer

In general, an XPath invalid predicate error means something is syntactically wrong with one of the XPath's predicates, the code between the [ and ].

Specifically in your case, there are two issues:

  1. The SyntaxError("invalid predicate") is because there's an extra ) in the predicate:

     for request in root.findall(".//Item[.='c item')]/../.."):
                                                    ^
    

    Note also that you can hoist the predicate to avoid navigating down and then back up (../..):

    Instead of

     .//Item[.='c item']/../..
    

    consider

     .//Request[Items/Item = 'c item']
    

    to select the Request element with the targeted Item.

  2. The XPath library you're using, ElementTree, is not a full implementation of the XPath standard. You can waste a lot of time trying to identify what ElementTree does support (".//Items[Item='c item']/.." happens to work here) and does not support, but it'd be better to just use a more compliant library such as lxml.
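As a minimal sketch of the second point, the following uses only the predicate forms that ElementTree documents as supported: `[tag='text']` on a direct child, followed by `..` to step up to the parent. The XML is inlined from the question so the example is self-contained; the names `SAMPLE`, `match_ids`, and `fallback_ids` are illustrative and not from the original code. Note that the traceback shows Python 3.6, and the `[.='text']` predicate form is documented as new in Python 3.7, which would be consistent with the error persisting after the typo was removed. The plain-Python filter at the end is a fallback that does not depend on predicate support at all.

```python
import xml.etree.ElementTree as ET

# Same structure as sample.xml from the question, inlined for a self-contained sketch.
SAMPLE = """
<Requests>
    <Request>
        <ID>123</ID>
        <Items>
            <Item>a item</Item>
            <Item>b item</Item>
            <Item>c item</Item>
        </Items>
    </Request>
    <Request>
        <ID>456</ID>
        <Items>
            <Item>d item</Item>
            <Item>e item</Item>
        </Items>
    </Request>
</Requests>
"""

root = ET.fromstring(SAMPLE)

# [Item='c item'] tests the text of a direct child of Items, and '..' steps
# from Items up to its parent Request.
matches = root.findall(".//Items[Item='c item']/..")
match_ids = [request.findtext("ID") for request in matches]

# Fallback: filter in plain Python instead of inside the path expression.
fallback_ids = [
    request.findtext("ID")
    for request in root.iter("Request")
    if any(item.text == "c item" for item in request.iter("Item"))
]

for request in matches:
    # Serialize the whole matching Request element, as the question asks.
    print(ET.tostring(request, encoding="unicode"))
```

With a fuller XPath implementation such as lxml, the hoisted form `.//Request[Items/Item = 'c item']` should be usable directly through its `xpath()` method, which avoids the down-and-up navigation entirely.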

kjhughes
  • Thanks for the advice. But the same error persists after I removed the extra ")". – techie11 Nov 27 '20 at 00:01
  • Without the extra `)`, the predicate is fine. Perhaps a new typo? Copy (don't retype) your exact code into the question again as an update and perhaps we can spot the new issue. – kjhughes Nov 27 '20 at 00:03
  • I re-copied the code and the output. Could you please review them? I also tried the second method, and the error is exactly the same. – techie11 Nov 27 '20 at 00:09
  • You can add to a question after it's been answered, but correcting code in the question to apply the solution suggested in the answer is really bad etiquette: it makes the thread impossible for a new reader to follow. – Michael Kay Nov 27 '20 at 08:10