As the docs indicate:
- findall returns the complete list of elements matching the given XPath expression, so we can use subscripts to access them. For example:
>>> import xml.etree.ElementTree as ET
>>> root = ET.fromstring("<a><b>c</b></a>")
>>> root.findall("./b")
[<Element 'b' at 0x02048C90>]
>>> lst = root.findall("./b")
>>> lst[0]
<Element 'b' at 0x02048C90>
We can also use a for loop to iterate through the list.
- iterfind returns an iterator (a generator), not a list. We cannot use subscripts to access its elements; we can only use it where iterators are accepted, for example in a for loop.
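To make the difference concrete, here is a minimal sketch. The three-element document and the variable names are just my own illustration. It shows that subscripting the result of iterfind fails, that next() and a for loop (or a comprehension) work, and that the iterator is exhausted once it has been consumed.

```python
import xml.etree.ElementTree as ET

# Illustrative document, not taken from any real data.
root = ET.fromstring("<a><b>c</b><b>d</b><b>e</b></a>")

matches = root.iterfind("./b")

# Subscripting an iterator is not supported, so this raises TypeError.
try:
    matches[0]
    subscriptable = True
except TypeError:
    subscriptable = False

# next() pulls a single element out of the iterator.
first = next(matches)

# Iterating continues from where next() left off.
rest = [el.text for el in matches]

# Once consumed, the iterator yields nothing more.
leftover = list(matches)

print(subscriptable, first.text, rest, leftover)
```

If you need indexing or want to go over the matches more than once, use findall (or wrap iterfind in list()).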
iterfind would be faster than findall in cases where you only want to iterate over the results (which, in my experience, is most of the time), since findall has to build the complete list before returning, whereas iterfind finds (yields) the next matching element only when next(iter) is called on it, which is what a for loop or similar construct does internally while iterating.
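One place where this lazy behaviour helps is when you can stop early. The sketch below uses a hypothetical helper, first_with_text (my own name, not part of ElementTree), that returns as soon as it sees a match with the wanted text, so later elements are never requested from the iterator.

```python
import xml.etree.ElementTree as ET


def first_with_text(root, path, text):
    """Hypothetical helper: first element matching `path` whose text is `text`, else None."""
    for el in root.iterfind(path):
        if el.text == text:
            # Returning here stops the iteration; remaining matches are never produced.
            return el
    return None


# Illustrative document, not taken from any real data.
root = ET.fromstring("<a><b>c</b><b>d</b><b>e</b></a>")

hit = first_with_text(root, "./b", "d")
miss = first_with_text(root, "./b", "zzz")

print(hit.text if hit is not None else None, miss)
```

With findall the same loop would work, but the full list of matches would be built first even though only part of it is looked at.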
In cases where you want the list, both seem to have similar timings.
Performance test for both cases:
In [1]: import xml.etree.ElementTree as ET
In [2]: x = ET.fromstring('<a><b>c</b><b>d</b><b>e</b></a>')
In [3]: def foo(root):
   ...:     d = root.findall('./b')
   ...:     for y in d:
   ...:         pass
   ...:
In [4]: def foo1(root):
   ...:     d = root.iterfind('./b')
   ...:     for y in d:
   ...:         pass
   ...:
In [5]: %timeit foo(x)
100000 loops, best of 3: 9.24 µs per loop
In [6]: %timeit foo1(x)
100000 loops, best of 3: 6.65 µs per loop
In [7]: def foo2(root):
   ...:     return root.findall('./b')
   ...:
In [8]: def foo3(root):
   ...:     return list(root.iterfind('./b'))
   ...:
In [9]: %timeit foo2(x)
100000 loops, best of 3: 8.54 µs per loop
In [10]: %timeit foo3(x)
100000 loops, best of 3: 8.4 µs per loop
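If you want to repeat the comparison outside IPython, something like the following should work with the standard timeit module. This is only a sketch: the 1000-child document, the function names and the number/repeat values are my own assumptions, and the timings will vary with your machine and Python version, so I am not quoting any output for it.

```python
import timeit
import xml.etree.ElementTree as ET

# Assumed test input: 1000 <b> children (larger than the 3-element document above).
root = ET.fromstring("<a>" + "<b>x</b>" * 1000 + "</a>")


def iterate_findall(root):
    # Builds the full list first, then iterates over it.
    for _ in root.findall("./b"):
        pass


def iterate_iterfind(root):
    # Pulls matches one at a time from the iterator.
    for _ in root.iterfind("./b"):
        pass


def bench(func, number=100, repeat=3):
    """Return the best observed time per call, in seconds."""
    timings = timeit.repeat(lambda: func(root), number=number, repeat=repeat)
    return min(timings) / number


results = {f.__name__: bench(f) for f in (iterate_findall, iterate_iterfind)}
for name, seconds in results.items():
    print(f"{name}: {seconds * 1e6:.2f} us per call")
```

Taking the minimum over several repeats is the usual way to reduce noise from other processes, which is also what the "best of 3" in the %timeit output refers to.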