I am trying to iterate over my XML file. The root element declares XML namespaces, which prevent me from iterating over the ClsItems tags. My XML:

<?xml version="1.0" encoding="utf-8"?>
<ArrayOfClsItems xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns:xsd="http://www.w3.org/2001/XMLSchema" xmlns="http://ws.comax.co.il/Comax_WebServices/">
    <ClsItems>
        <Name>Item 0</Name>
        <Size />
        <ID>1</ID>
    </ClsItems>
    <ClsItems>
        <Name>Item 1</Name>
        <Size />
        <ID>1</ID>
    </ClsItems>
    ...
</ArrayOfClsItems>

My attempt at parsing and iterating:

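import xml.etree.ElementTree as ET

tree = ET.parse(xml_path)
root = tree.getroot()
headers = []
# Keys must match the tag names exactly, so "ID" rather than "Id".
product_dict = {"Name": [], "Size": [], "ID": []}

# Finds nothing while the xmlns declarations are present on the root element.
for idx, item in enumerate(root.findall('.//ClsItems')):
    if idx == 0:
        # Take the column names from the children of the first item.
        headers = [x.tag for x in item]
    for h in headers:
        if h in product_dict:
            product_dict[h].append(item.find(h).text)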

When I deleted the XML namespace declarations from the root element:

xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" 
xmlns:xsd="http://www.w3.org/2001/XMLSchema" 
xmlns="http://ws.comax.co.il/Comax_WebServices/" 

The code worked perfectly.

Any ideas on how to query the root so that I can iterate over the ClsItems elements with the namespaces left in place?

gil
  • XML namespaces are a common stumbling block. See https://docs.python.org/3/library/xml.etree.elementtree.html#parsing-xml-with-namespaces. Many similar questions have already been asked, for example https://stackoverflow.com/q/20435500/407651. See also https://stackoverflow.com/a/62117710/407651. – mzjn Sep 30 '21 at 11:05
  • 1) Post a **VALID** XML doc. 2) What information do you need to extract from the XML? – balderman Sep 30 '21 at 11:09
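
Following the namespace pointer in the first comment, here is a minimal sketch of one way this could work, not a definitive answer. The default namespace (`xmlns="http://ws.comax.co.il/Comax_WebServices/"`) applies to every unprefixed element, so `ClsItems` is really `{http://ws.comax.co.il/Comax_WebServices/}ClsItems` as far as ElementTree is concerned. The prefix `c`, the `NS` mapping, and the helper `collect_items` are illustrative names that are not in the question. The sample XML is copied from the question with the elided `...` items left out, and the dictionary keys follow the tag names as they appear in the XML (`Name`, `Size`, `ID`).

```python
import xml.etree.ElementTree as ET

# Sample document copied from the question (the elided "..." items are left out).
XML_TEXT = """<?xml version="1.0" encoding="utf-8"?>
<ArrayOfClsItems xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns:xsd="http://www.w3.org/2001/XMLSchema" xmlns="http://ws.comax.co.il/Comax_WebServices/">
    <ClsItems>
        <Name>Item 0</Name>
        <Size />
        <ID>1</ID>
    </ClsItems>
    <ClsItems>
        <Name>Item 1</Name>
        <Size />
        <ID>1</ID>
    </ClsItems>
</ArrayOfClsItems>
"""

# Hypothetical prefix "c" mapped to the document's default namespace URI.
# The prefix is arbitrary; only the URI has to match the XML.
NS = {"c": "http://ws.comax.co.il/Comax_WebServices/"}


def collect_items(root, fields=("Name", "Size", "ID")):
    """Collect the text of each requested child of every ClsItems element."""
    product_dict = {field: [] for field in fields}
    for item in root.findall(".//c:ClsItems", NS):
        for field in fields:
            child = item.find(f"c:{field}", NS)
            # A missing child is recorded as None; an empty element such as
            # <Size /> also yields None because its .text is None.
            product_dict[field].append(child.text if child is not None else None)
    return product_dict


# Parse from bytes so the encoding declaration in the prolog is honoured.
root = ET.fromstring(XML_TEXT.encode("utf-8"))
product_dict = collect_items(root)
print(product_dict)
```

With a file on disk, `ET.parse(xml_path).getroot()` can be passed to `collect_items` in the same way. On Python 3.8 or later, `root.findall(".//{*}ClsItems")` is an alternative that matches the tag in any namespace without a prefix mapping.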

0 Answers