Find element by text with XPath in ElementTree

Question

Given an XML like the following:

<root>
    <element>A</element>
    <element>B</element>
</root>

How can I match the element with content A using ElementTree and its support for XPath? Thanks

score 38 · Accepted Answer · answered May 31 '12 at 15:12

38

AFAIK ElementTree does not support XPath. Has it changed?

Anyway, you can use lxml and the following XPath expression:

import lxml.etree
doc = lxml.etree.parse('t.xml')
print doc.xpath('//element[text()="A"]')[0].text
print doc.xpath('//element[text()="A"]')[0].tag

The result will be:

A
element

answered May 31 '12 at 15:12

brandizzi

26,083
8
103
158

2

ElementTree supports xpath, see my answer below. – neves Apr 13 '18 at 18:56

neves · Answer 2 · 2022-03-03T20:32:35.033

12

You can use a subset of XPath in ElementTree. It isn't necessary to install any lib.

config.findall('.//*[element="A"]/element')

As the comment bellow from @Bionicegenius explains, the expression above just works if your element has no sibilings, but you get the idea.

It is possible to use XPath in ElementTree, and it is the simplest solution.

edited Mar 03 '22 at 20:32

answered Nov 07 '17 at 23:18

neves

33,186
27
159
192

2

This has a problem of selecting all the elements on the same level as the desired node. This will find both elements with values A and B. If you modify it to find, then it will only ever find the element with a value of A, even if you search for B - it will only return the first child. – Bioniclegenius Nov 30 '17 at 16:22
2

Searching by text is a recent addition, the syntax is `.//element[.="A"]` . The above answer wouldn't work. ElementTree only support a very limited subset of XPath. – dwery Mar 01 '22 at 19:00
1

@dwery - Great, thanks! Once I knew the syntax to look for, I found it in the python docs - it says it was added in Python 3.7. It's been there long enough for me! – ArtOfWarfare Apr 14 '22 at 21:50

score 11 · Answer 3 · answered May 31 '12 at 15:58

If you want to use the standard library ElementTree, rather than lxml, you can use iteration to find all sub elements with a particular text value. For example:

import sys
import xml.etree.ElementTree as etree

s = """<root>
    <element>A</element>
    <element>B</element>
</root>"""

e = etree.fromstring(s)

if sys.version_info < (2, 7):
    found = [element for element in e.getiterator() if element.text == 'A']
else:
    found = [element for element in e.iter() if element.text == 'A']

print found[0].text # This prints 'A', honestly!

Note: you may want to perform some stripping of the text value of your elements in the list comprehension.

Edit This will work to any depth in your XML tree. For example,

s = """<root>
    <element>A</element>
    <element><sub>A</sub></element>
</root>"""

found = [element for element in e.getiterator() if element.text == 'A']

for f in found:
    print f

will print

<Element element at 7f20a882e3f8>
<Element sub at 7f20a882e4d0>

I have chosen to use cElementTree over lxml since for my task it has a much lower memory overhead (memory being more critical than CPU usage), this was the final bit of code I needed to move from lxml — Patrick, Sep 09 '13 at 03:26

Find element by text with XPath in ElementTree

3 Answers3

Linked