As per this thread, I am using xml.dom.minidom to do some very basic, read-only XML traversal.
What confuses me is why its getElementsByTagName finds nodes several hierarchy levels deep without being given their exact path.
XML:
<data>
    <items>
        <item name="item1"></item>
        <item name="item2"></item>
        <item name="item3"></item>
        <item name="item4"></item>
    </items>
    <secondSetOfItems>
        <item name="item5"></item>
        <item name="item6"></item>
        <item name="item7"></item>
        <item name="item8"></item>
    </secondSetOfItems>
</data>
Python code:
from xml.dom import minidom

xmldoc = minidom.parse('sampleXML.xml')
# Called on the document, this matches every <item> at any depth
items = xmldoc.getElementsByTagName('item')
for item in items:
    print(item.attributes['name'].value)
Prints:
item1
item2
item3
item4
item5
item6
item7
item8
What bothers me is that it implicitly finds tags named item under both data->items and data->secondSetOfItems.
How do I make it follow an explicit path and only extract the items under one of the two categories? E.g. only those under data->secondSetOfItems:
item5
item6
item7
item8
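For reference, here is a minimal sketch of one way the search might be scoped. It is not necessarily the recommended approach. getElementsByTagName is also available on Element nodes and searches all descendants of whichever node it is called on, so calling it on the secondSetOfItems element instead of the document should limit the matches to that subtree. The inline XML string and the names SAMPLE_XML, second_set, names and direct_children are assumptions made for illustration; they replace sampleXML.xml so the snippet runs on its own.

```python
from xml.dom import minidom

# Inline copy of the sample document (stands in for sampleXML.xml).
SAMPLE_XML = """<data>
    <items>
        <item name="item1"></item>
        <item name="item2"></item>
        <item name="item3"></item>
        <item name="item4"></item>
    </items>
    <secondSetOfItems>
        <item name="item5"></item>
        <item name="item6"></item>
        <item name="item7"></item>
        <item name="item8"></item>
    </secondSetOfItems>
</data>"""

xmldoc = minidom.parseString(SAMPLE_XML)

# Step 1: narrow down to the one <secondSetOfItems> element.
second_set = xmldoc.getElementsByTagName('secondSetOfItems')[0]

# Step 2: search only inside that element's subtree.
names = [item.getAttribute('name')
         for item in second_set.getElementsByTagName('item')]

# Stricter variant: keep only *direct* children named <item>,
# skipping whitespace text nodes and any deeper descendants.
direct_children = [node.getAttribute('name')
                   for node in second_set.childNodes
                   if node.nodeType == node.ELEMENT_NODE
                   and node.tagName == 'item']

for name in names:
    print(name)
```

With this sample both lists contain the same four names. They would differ only if secondSetOfItems contained item elements nested more deeply. minidom has no path syntax of its own, so an explicit path has to be walked one element at a time like this.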