How can I find elements with different attribute values using lxml in Python?

For example:

<Form>
    <Subform ind="0">
        <Check ind="0">0</Check>
        <Check ind="1">1</Check>
        <Check ind="2">2</Check>
        <Check ind="3">3</Check>
    </Subform>
</Form>

To retrieve the Check elements I do:

tree.findall("./Form/Subform/Check")

To get the first:

tree.findall("./Form/Subform/Check[@ind='0']")

But what I want to do is something like:

tree.findall("./Form/Subform/Check[@ind='0' or @ind='1']")

to retrieve only the first and second (or the first and last).

How can I do that with lxml?

maazza

2 Answers

tree.findall("./Form/Subform/Check[@ind='0' or @ind='1']")

The expression is valid and will work in lxml with the xpath() method. If you want to make it "scalable", you can construct the expression dynamically:

values = ["0", "1"]
condition = " or ".join("@ind = '%s'" % value for value in values)
print(root.xpath("//Subform/Check[%s]" % condition))
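For completeness, here is a self-contained sketch of the same idea using the XML from the question. The helper name find_checks is made up for illustration, and it assumes the values contain no quote characters, since they are interpolated directly into the expression:

```python
from lxml import etree

# Sample document taken from the question.
XML = """\
<Form>
    <Subform ind="0">
        <Check ind="0">0</Check>
        <Check ind="1">1</Check>
        <Check ind="2">2</Check>
        <Check ind="3">3</Check>
    </Subform>
</Form>
"""

root = etree.fromstring(XML)


def find_checks(root, values):
    """Return Check elements whose ind attribute equals any of `values`.

    Hypothetical helper: the values are pasted into the XPath string,
    so they must not contain single quotes.
    """
    if not values:
        # An empty predicate "[]" would not be valid XPath.
        raise ValueError("values must not be empty")
    condition = " or ".join("@ind = '%s'" % value for value in values)
    return root.xpath("//Subform/Check[%s]" % condition)


# First and last Check, as asked in the question.
texts = [el.text for el in find_checks(root, ["0", "3"])]
print(texts)
```

xpath() returns the matches in document order, so the order of the values list does not affect the order of the result.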
alecxe

That expression is correct, but you need to use the xpath() method, which provides full XPath 1.0 support. findall() only supports a limited subset of XPath, as xml.etree.ElementTree does:

tree.xpath("/Form/Subform/Check[@ind='0' or @ind='1']")
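As a rough illustration of the difference, the sketch below runs the xpath() call against the question's XML and compares it with a findall()-based fallback that stays inside the limited subset by filtering in Python. The variable names (wanted, via_xpath, via_findall) are only for this example:

```python
from lxml import etree

# Sample document taken from the question.
XML = """\
<Form>
    <Subform ind="0">
        <Check ind="0">0</Check>
        <Check ind="1">1</Check>
        <Check ind="2">2</Check>
        <Check ind="3">3</Check>
    </Subform>
</Form>
"""

root = etree.fromstring(XML)
tree = root.getroottree()

# Full XPath 1.0: the "or" predicate is handled by xpath().
via_xpath = tree.xpath("/Form/Subform/Check[@ind='0' or @ind='1']")

# Fallback that avoids "or": select every Check with a simple path,
# then filter on the attribute value in Python.
wanted = {"0", "1"}
via_findall = [
    check
    for check in root.findall("./Subform/Check")
    if check.get("ind") in wanted
]

print([el.text for el in via_xpath])
print([el.text for el in via_findall])
```

Both approaches select the same two elements here. The xpath() version keeps the selection logic in one expression, while the filtering version would also work with plain xml.etree.ElementTree.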
har07