Python noob here. What's the cleanest way to remove all the "profile" tags whose updated attribute is "true"?

I have tried the following code but it's throwing: SyntaxError("cannot use absolute path on element")

 root.remove(root.findall("//Profile[@updated='true']"))

XML:

<parent>
  <child type="First">
    <profile updated="true">
       <other> </other>
    </profile>
  </child>
  <child type="Second">
    <profile updated="true">
       <other> </other>
    </profile>
  </child>
  <child type="Third">
     <profile>
       <other> </other>
    </profile>
  </child>
</parent>
user1195192

1 Answer

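If you are using xml.etree.ElementTree, use the remove() method to remove a node. It has to be called on the node's direct parent, so you need a reference to that parent. The error you are seeing comes from the leading "//": ElementTree does not accept absolute paths on an element, so the path has to be relative (for example, start with ".//"). Also note that findall() returns a list while remove() takes a single element, and that tag names are case-sensitive ("Profile" does not match "profile"). Hence, the solution: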

import xml.etree.ElementTree as ET

data = """
<parent>
  <child type="First">
    <profile updated="true">
       <other> </other>
    </profile>
  </child>
  <child type="Second">
    <profile updated="true">
       <other> </other>
    </profile>
  </child>
  <child type="Third">
     <profile>
       <other> </other>
    </profile>
  </child>
</parent>"""

root = ET.fromstring(data)
for child in root.findall("child"):
    # remove() only accepts direct children of `child`, so match direct children only
    for profile in child.findall("profile[@updated='true']"):
        child.remove(profile)

# On Python 3, encoding="unicode" makes tostring() return str instead of bytes
print(ET.tostring(root, encoding="unicode"))

Prints:

<parent>
  <child type="First">
    </child>
  <child type="Second">
    </child>
  <child type="Third">
     <profile>
       <other> </other>
    </profile>
  </child>
</parent>

Note that with lxml.etree this would be a bit simpler:

root = ET.fromstring(data)
for profile in root.xpath(".//child/profile[@updated='true']"):
    profile.getparent().remove(profile)

where ET is:

import lxml.etree as ET
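
If the profile elements can sit at different depths and lxml is not available, one option with plain xml.etree.ElementTree is to build a child-to-parent lookup first. The following is only a sketch: the name parent_map and the extra group wrapper in the sample XML are made up for illustration and are not part of the original question.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample: one profile is nested a level deeper than in the question.
data = """
<parent>
  <child type="First">
    <group>
      <profile updated="true"><other/></profile>
    </group>
  </child>
  <child type="Second">
    <profile updated="true"><other/></profile>
  </child>
  <child type="Third">
    <profile><other/></profile>
  </child>
</parent>"""

root = ET.fromstring(data)

# ElementTree elements do not know their parent, so build a child -> parent lookup.
parent_map = {child: parent for parent in root.iter() for child in parent}

# findall() returns a list, so removing elements while looping over it is safe.
for profile in root.findall(".//profile[@updated='true']"):
    parent_map[profile].remove(profile)

remaining = root.findall(".//profile")
print(len(remaining))  # 1 (only the profile without updated="true" is left)
```

The lookup plays the role that getparent() plays in lxml, at the cost of one extra pass over the tree.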
alecxe
  • Thanks for the solution. I formatted my actual XML implementation and your code didn't remove the Profile tag on it (not your fault). I'll accept your answer and post a new question. – user1195192 Sep 05 '16 at 22:37
  • @user1195192 don't worry - just update the XML and I'll update the code appropriately. – alecxe Sep 05 '16 at 22:38
  • Thanks, I figured it out. It needed another for loop. The lxml code looks so much cleaner. – user1195192 Sep 05 '16 at 23:05