
Let's assume that I have an XML like this:

<Rules>
       <Set Parameter="4" To="90">
             <If Parameter="1087" EqualsTo="90" />
        </Set>
        <Set Parameter="5" To="-5">
             <If Parameter="1087" EqualsTo="87" />
        </Set>
        <Set Parameter="6" To="[-5,23;36,7;58,7;78,8;94,47]">
             <If Parameter="1087" EqualsTo="87" />
         </Set>
         <Set Parameter="14" To="7,5" />
         <Set Parameter="15" To="-7,5" />
         <Set Parameter="16" To="0,5" />
         <Set Parameter="17" To="3" />
         <Set Parameter="18" To="-3" />
             <If Parameter="1087" EqualsTo="87" />
         </Set>
 </Rules> 

I would like to read this XML file and convert it to a pandas DataFrame:

Parameter<Set>       Parameter<If>
4                     1087
5                     1087
6                     1087
14                    1087
15                    1087
16                    1087
17                    1087
18                    1087

This is what I have tried so far, but I am getting errors, and there is probably a more efficient way to do this:

import xml.etree.ElementTree as ET
import pandas as pd
import os

def getMetrics(file_name):
    path="C:\\Users\Z003Z9CF\Downloads"
    os.chdir(path)
    tree = ET.parse('sample1.xml')
    print(tree)
    root = tree.getroot()
    print(root.tag)
    result = []
    for setnode in root.iter('Set'):                         
        node = setnode.attrib["Parameter"]  
        for ifnode in setnode:                              
        if "Parameter" in ifnode.attrib:
            result.append(dict(node=node, parameter=ifnode.attrib.get("Parameter")))
                    return result 

df = pd.DataFrame(getMetrics('sample1.xml'), columns["Parameter","Parameter"])          
print(df)
Pankaj

1 Answer


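First of all, both the function's `return` and the way you call it are wrong. The `return` is misindented, the `if` is not indented under the inner `for` loop, and in the `pd.DataFrame` call `columns[...]` is missing the `=` that makes it a keyword argument. The function also ignores its `file_name` argument and parses a hard-coded file name instead. Finally, the output shown in your question does not match your XML.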

Only the Set elements that contain an If child produce a row, so for your XML the output should be:

4                     1087
5                     1087
6                     1087
18                    1087

Here is your function with those mistakes fixed. It runs correctly on my local setup.

import xml.etree.ElementTree as ET
import pandas as pd


def getMetrics(file_name):
    tree = ET.parse(file_name)
    root = tree.getroot()
    result = []
    for setnode in root.iter('Set'):
        node = setnode.attrib["Parameter"]
        for ifnode in setnode:
            if "Parameter" in ifnode.attrib:
                result.append(dict(node=node, parameter=ifnode.attrib.get("Parameter")))

    return result


df = pd.DataFrame(getMetrics('sample.xml'))
print(df)
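
If you also want the column headers from your question, a minimal variant could look like the sketch below. It is only an illustration: the names `XML_TEXT` and `set_if_pairs` are made up for this example, the XML is a shortened inline copy of the corrected file, and it is parsed from a string so the snippet runs without any file on disk.

```python
import xml.etree.ElementTree as ET

import pandas as pd

# Hypothetical inline sample: a shortened copy of the corrected XML.
XML_TEXT = """<Rules>
    <Set Parameter="4" To="90">
        <If Parameter="1087" EqualsTo="90" />
    </Set>
    <Set Parameter="5" To="-5">
        <If Parameter="1087" EqualsTo="87" />
    </Set>
    <Set Parameter="14" To="7,5" />
    <Set Parameter="18" To="-3">
        <If Parameter="1087" EqualsTo="87" />
    </Set>
</Rules>"""


def set_if_pairs(xml_text):
    """Return one (Set parameter, If parameter) pair per <If> nested in a <Set>."""
    root = ET.fromstring(xml_text)
    pairs = []
    for set_node in root.iter("Set"):
        # findall("If") matches only the direct <If> children of this <Set>,
        # so a <Set> without an <If> contributes no row.
        for if_node in set_node.findall("If"):
            pairs.append((set_node.get("Parameter"), if_node.get("Parameter")))
    return pairs


# Column labels taken from the table in the question.
df = pd.DataFrame(set_if_pairs(XML_TEXT), columns=["Parameter<Set>", "Parameter<If>"])
print(df)
```

Returning tuples instead of dicts lets the `columns=` argument name the columns directly. With dicts, `columns=` would select keys rather than rename them.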

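Your XML file should look like this. In your version, the Set element for parameter 18 is self-closing (`/>`) but is still followed by an If element and a closing `</Set>` tag, so the file is not well-formed: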

<?xml version="1.0" encoding="UTF-8"?>
<Rules>
    <Set Parameter="4" To="90">
         <If Parameter="1087" EqualsTo="90" />
    </Set>
    <Set Parameter="5" To="-5">
         <If Parameter="1087" EqualsTo="87" />
    </Set>
    <Set Parameter="6" To="[-5,23;36,7;58,7;78,8;94,47]">
         <If Parameter="1087" EqualsTo="87" />
     </Set>
     <Set Parameter="14" To="7,5" />
     <Set Parameter="15" To="-7,5" />
     <Set Parameter="16" To="0,5" />
     <Set Parameter="17" To="3" />
     <Set Parameter="18" To="-3" >
         <If Parameter="1087" EqualsTo="87" />
     </Set>
</Rules>
Pankaj
  • @praveen did this answer help you solve your issue? – Pankaj Feb 07 '19 at 06:46
  • It works with the XML file above, but with a larger file I get this error: File "C:\ProgramData\Anaconda3\lib\xml\etree\ElementTree.py", line 598, in parse self._root = parser._parse_whole(source) File "", line unknown ParseError: XML or text declaration not at start of entity: line 406, column 0 – Praveen Kumar Feb 07 '19 at 12:22
  • The issue is probably in your `xml`. Please check it closely. You can validate it with [xmlValidation](https://www.xmlvalidation.com/example/) – Pankaj Feb 08 '19 at 04:35
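
The message "XML or text declaration not at start of entity" means the parser met an `<?xml ...?>` declaration somewhere other than the very start of the input, at line 406 in this case. One possible cause, which is only an assumption here, is that several XML documents were pasted into one file. To see the offending line without an online validator, you can read the `position` attribute of `ParseError`. The helper name `locate_parse_error` and the `BROKEN` sample below are hypothetical.

```python
import xml.etree.ElementTree as ET


def locate_parse_error(xml_text):
    """Return (line, column, source_line) for the first parse error, or None."""
    try:
        ET.fromstring(xml_text)
    except ET.ParseError as err:
        # position is a (line, column) tuple; lines are counted from 1.
        line, column = err.position
        lines = xml_text.splitlines()
        source_line = lines[line - 1] if 0 < line <= len(lines) else ""
        return line, column, source_line
    return None


# Hypothetical input: two XML documents concatenated into one string.
BROKEN = "\n".join([
    '<?xml version="1.0"?>',
    "<Rules>",
    '    <Set Parameter="4" To="90" />',
    "</Rules>",
    '<?xml version="1.0"?>',
    "<Rules />",
])

print(locate_parse_error(BROKEN))
```

For a file on disk, read its text first and pass it to the helper. If the file really does hold several documents, each one has to be parsed separately or wrapped in a single root element.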