I need to remove some parts of an XML file. For example, given this file:
<dict>
<key>Images</key>
<array>
<dict>
<key>ImageIndex</key>
<integer>0</integer>
<key>NumberOfROIs</key>
<integer>42</integer>
<key>ROIs</key>
<array>
<dict>
<key>Area</key>
<real>0.0</real>
<key>Center</key>
<string>(0.000000, 0.000000, 0.000000)</string>
<key>Dev</key>
<real>0.0</real>
<key>IndexInImage</key>
<integer>0</integer>
<key>Max</key>
<real>1358</real>
<key>Mean</key>
<real>1358</real>
<key>Min</key>
<real>1358</real>
<key>Name</key>
<string>Calcification</string>
<key>NumberOfPoints</key>
<integer>1</integer>
<key>Point_mm</key>
<array>
<string>(0.000000, 0.000000, 0.000000)</string>
</array>
<key>Point_px</key>
<array>
<string>(2964.620117, 3427.979980)</string>
</array>
<key>Total</key>
<real>1358</real>
<key>Type</key>
<integer>19</integer>
</dict>
<dict>
<key>Area</key>
<real>0.0</real>
<key>Center</key>
<string>(0.000000, 0.000000, 0.000000)</string>
<key>Dev</key>
<real>0.0</real>
<key>IndexInImage</key>
<integer>1</integer>
<key>Max</key>
<real>1401</real>
<key>Mean</key>
<real>1401</real>
<key>Min</key>
<real>1401</real>
<key>Name</key>
<string>Calcification</string>
<key>NumberOfPoints</key>
<integer>1</integer>
<key>Point_mm</key>
<array>
<string>(0.000000, 0.000000, 0.000000)</string>
</array>
<key>Point_px</key>
<array>
<string>(2993.159912, 3403.550049)</string>
</array>
<key>Total</key>
<real>1401</real>
<key>Type</key>
<integer>19</integer>
</dict>
<dict>
<key>Area</key>
<real>1.3665732145309448</real>
<key>Center</key>
<string>(0.000000, 0.000000, 0.000000)</string>
<key>Dev</key>
<real>66.487342834472656</real>
<key>IndexInImage</key>
<integer>36</integer>
<key>Max</key>
<real>1836</real>
<key>Mean</key>
<real>1583.29638671875</real>
<key>Min</key>
<real>1313</real>
<key>Name</key>
<string>Mass</string>
<key>NumberOfPoints</key>
<integer>89</integer>
<key>Point_mm</key>
<array>
<string>(0.000000, 0.000000, 0.000000)</string>
<string>(0.000000, 0.000000, 0.000000)</string>
</array>
<key>Point_px</key>
<array>
<string>(3196.290039, 1048.599976)</string>
<string>(3203.560059, 1046.170044)</string>
<string>(3211.330078, 1042.780029)</string>
<string>(3189.500000, 1050.540039)</string>
</array>
<key>Total</key>
<real>44457380</real>
<key>Type</key>
<integer>15</integer>
</dict>
</array>
</dict>
</array>
</dict>
</plist>
I want to remove every <dict>...</dict> block, tags included, that has <string>Calcification</string> in it. In other words, I want to keep only the parts that do not contain Calcification. My desired result for this file would be:
<dict>
<key>Images</key>
<array>
<dict>
<key>ImageIndex</key>
<integer>0</integer>
<key>NumberOfROIs</key>
<integer>42</integer>
<key>ROIs</key>
<array>
<dict>
<key>Area</key>
<real>1.3665732145309448</real>
<key>Center</key>
<string>(0.000000, 0.000000, 0.000000)</string>
<key>Dev</key>
<real>66.487342834472656</real>
<key>IndexInImage</key>
<integer>36</integer>
<key>Max</key>
<real>1836</real>
<key>Mean</key>
<real>1583.29638671875</real>
<key>Min</key>
<real>1313</real>
<key>Name</key>
<string>Mass</string>
<key>NumberOfPoints</key>
<integer>89</integer>
<key>Point_mm</key>
<array>
<string>(0.000000, 0.000000, 0.000000)</string>
<string>(0.000000, 0.000000, 0.000000)</string>
</array>
<key>Point_px</key>
<array>
<string>(3196.290039, 1048.599976)</string>
<string>(3203.560059, 1046.170044)</string>
<string>(3211.330078, 1042.780029)</string>
<string>(3189.500000, 1050.540039)</string>
</array>
<key>Total</key>
<real>44457380</real>
<key>Type</key>
<integer>15</integer>
</dict>
</array>
</dict>
</array>
</dict>
</plist>
This is what I have tried:
import xml.etree.ElementTree as ET
data = r"C:\Users\vinc\Desktop\ExemploXML.xml"
tree = ET.parse(data)
root = tree.getroot()
for e in root.findall(".//string"):
    if e.text == 'Calcification':
        print(e)
        root.remove(e)
    else:
        pass
tree.write(r'C:\Users\vinc\Desktop\out.xml')
Result ======================================
<Element 'string' at 0x000002B085002EA0>
---------------------------------------------------------------------------
ValueError Traceback (most recent call last)
<ipython-input-43-d417d00038ed> in <module>
      8
      9         print(e)
---> 10         root.remove(e)
     11     else:
     12         pass
ValueError: list.remove(x): x not in list
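As far as I can tell, the ValueError comes from the fact that Element.remove() only looks among the direct children of the element it is called on, while findall(".//string") returns elements at any depth. A small illustration on a made-up three-level tree (the tag names a, b and c are hypothetical, not from my file):

```python
import xml.etree.ElementTree as ET

# Hypothetical tree: <c> is a grandchild of the root <a>, not a direct child.
root = ET.fromstring("<a><b><c/></b></a>")
c = root.find(".//c")

try:
    # Fails: <c> is not in the root's own list of children.
    root.remove(c)
    outcome = "removed"
except ValueError:
    outcome = "ValueError"

# Works: <b> is the actual parent of <c>.
root.find("b").remove(c)
print(outcome, root.find(".//c"))
```

So the removal apparently has to be called on the parent of the element, not on the root.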
For context, these XML files hold semantic segmentation annotations, and I want to remove the annotations of the Calcification class.
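One possible way to do this with ElementTree, as a minimal sketch rather than a definitive solution: since elements do not keep a reference to their parent, build a child-to-parent map, collect every <dict> that has a direct <string> child with the text Calcification, and remove each one from its own parent. The SAMPLE string and the helper name remove_rois_named are made up for illustration. SAMPLE only mimics the nesting of the file above, with an assumed <plist> root.

```python
import xml.etree.ElementTree as ET

# Hypothetical, shortened sample with the same nesting as the real file.
SAMPLE = """<plist><dict>
<key>Images</key>
<array><dict>
<key>ROIs</key>
<array>
<dict><key>Name</key><string>Calcification</string></dict>
<dict><key>Name</key><string>Mass</string></dict>
</array>
</dict></array>
</dict></plist>"""


def remove_rois_named(root, name):
    """Remove every <dict> that has a direct <string> child equal to `name`."""
    # ElementTree elements do not know their parent, so map child -> parent.
    parent_of = {child: parent for parent in root.iter() for child in parent}
    # Collect first, then remove, to avoid mutating the tree while iterating.
    targets = [
        d for d in root.iter("dict")
        if any(s.text == name for s in d.findall("string"))
    ]
    for d in targets:
        parent_of[d].remove(d)
    return len(targets)


root = ET.fromstring(SAMPLE)
removed = remove_rois_named(root, "Calcification")
remaining = [s.text for s in root.iter("string")]
print(removed, remaining)
```

On the real file the same function would be called on tree.getroot() before tree.write(...). Only direct <string> children are checked, so the outer <dict> elements that merely contain a Calcification ROI somewhere deeper are kept. The sketch leaves NumberOfROIs and IndexInImage untouched, which matches the desired output above.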