I am parsing an XML file using xml.etree.ElementTree. I want to find elements based on their name attribute.

fnd = root.findall("./player[@name='Pqp239']")

But this finds only exact matches for the name. How would you find elements whose name contains a given substring? Something like this

fnd = root.findall("./player[@name='rob']")

would find all elements whose name contains rob, like these:

Rob
Robert
chifforobe
alecxe
Pqp239

1 Answer

You can use the contains() function, but only if you switch from xml.etree.ElementTree, which has only limited XPath support, to lxml.etree:

import lxml.etree as ET

tree = ET.parse("input.xml")
root = tree.getroot()

root.xpath("./player[contains(@name, 'rob')]")

Note, though, that to make the partial match case-insensitive you would also need to apply the translate() function:
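A minimal sketch of that approach, assuming a small hypothetical players document built from the names in the question. The XPath variables $upper, $lower and $needle are illustrative names, passed as keyword arguments to xpath():

```python
import lxml.etree as ET

# Hypothetical sample data; the names are taken from the question.
xml = b"""<players>
  <player name="Rob"/>
  <player name="Robert"/>
  <player name="chifforobe"/>
  <player name="Pqp239"/>
</players>"""

root = ET.fromstring(xml)

UPPER = "ABCDEFGHIJKLMNOPQRSTUVWXYZ"
LOWER = "abcdefghijklmnopqrstuvwxyz"

# translate() replaces each uppercase letter in @name with its lowercase
# counterpart, so contains() compares lowercase against lowercase.
matches = root.xpath(
    "./player[contains(translate(@name, $upper, $lower), $needle)]",
    upper=UPPER,
    lower=LOWER,
    needle="rob",  # must itself be lowercase
)

names = [el.get("name") for el in matches]
print(names)
```

With this sample data the query should select Rob, Robert and chifforobe but not Pqp239. XPath 1.0 has no lower-case() function, which is why the two alphabets are spelled out.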

alecxe
  • With translate(), instead of typing the whole alphabet, would it be possible to use [a-z] and [A-Z]? – Pqp239 May 11 '16 at 23:43
  • @Pqp239 nope, but it's not like the English alphabet changes frequently :) – alecxe May 11 '16 at 23:44
  • I thought I was missing out on some new findall capabilities ;) – Padraic Cunningham May 12 '16 at 00:01
  • I installed lxml just now and am getting `ValueError: Unicode strings with encoding declaration are not supported. Please use bytes input or XML fragments without declaration.` How would I fix this issue? This happens when I parse it. – Pqp239 May 12 '16 at 00:12
  • @Pqp239 okay, please see http://stackoverflow.com/questions/15830421/xml-unicode-strings-with-encoding-declaration-are-not-supported. – alecxe May 12 '16 at 00:24
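
For the ValueError mentioned in the comments, a minimal sketch of the usual workaround: hand lxml bytes rather than a str when the document carries an encoding declaration. The sample text below is hypothetical.

```python
import lxml.etree as ET

# Hypothetical document held as a str, including an encoding declaration.
text = (
    '<?xml version="1.0" encoding="UTF-8"?>\n'
    '<players><player name="Rob"/></players>'
)

try:
    # Parsing a str that declares its own encoding is what triggers
    # the ValueError quoted in the comment.
    ET.fromstring(text)
except ValueError as exc:
    print("str input rejected:", exc)

# Encoding to bytes lets the parser apply the declared encoding itself.
root = ET.fromstring(text.encode("utf-8"))
first_name = root[0].get("name")
print(first_name)
```

When the XML lives in a file, passing the filename to ET.parse(), or opening the file in binary mode, avoids the str input in the first place.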