Is there an easy way to map the values of a dictionary into an XML file? I have an XML template and need to substitute every value from the dict into it. The dict keys appear as the text of the elements in the XML file.

value_dict = { "country1_rank" : 1, "country1_year" : 2008, "country1_gdppc" : 141100, "country2_rank" : 4, "country2_year" : 2011, "country2_gdppc" : 59900, "country3_rank" : 69, "country3_year" : 2011, "country3_gdppc" : 13600 }

<?xml version="1.0"?>
<data>
    <country1>
        <rank>country1_rank</rank>
        <year>country1_year</year>
        <gdppc>country1_gdppc</gdppc>
    </country1>
    <country2>
        <rank>country2_rank</rank>
        <year>country2_year</year>
        <gdppc>country2_gdppc</gdppc>
    </country2>
    <country3>
        <rank>country3_rank</rank>
        <year>country3_year</year>
        <gdppc>country3_gdppc</gdppc>
    </country3>
</data>

In the end the output should look like this:

<?xml version="1.0"?>
<data>
    <country1>
        <rank>1</rank>
        <year>2008</year>
        <gdppc>141100</gdppc>
    </country1>
    <country2>
        <rank>4</rank>
        <year>2011</year>
        <gdppc>59900</gdppc>
    </country2>
    <country3>
        <rank>69</rank>
        <year>2011</year>
        <gdppc>13600</gdppc>
    </country3>
</data>
phappy

2 Answers

You can use Python's built-in xml.etree.ElementTree module. For example:

# Define data
value_dict = { 
    "country1_rank" : 1, 
    "country1_year" : 2008, 
    "country1_gdppc" : 141100, 
    "country2_rank" : 4, 
    "country2_year" : 2011, 
    "country2_gdppc" : 59900, 
    "country3_rank" : 69, 
    "country3_year" : 2011, 
    "country3_gdppc" : 13600 
}

xml_str = """<?xml version="1.0"?>
<data>
    <country1>
        <rank>country1_rank</rank>
        <year>country1_year</year>
        <gdppc>country1_gdppc</gdppc>
    </country1>
    <country2>
        <rank>country2_rank</rank>
        <year>country2_year</year>
        <gdppc>country2_gdppc</gdppc>
    </country2>
    <country3>
        <rank>country3_rank</rank>
        <year>country3_year</year>
        <gdppc>country3_gdppc</gdppc>
    </country3>
</data>
"""
import xml.etree.ElementTree as ET

# Parse data
xmltree = ET.fromstring(xml_str)

for country in xmltree:
    for attr in country:
        attr.text = str(value_dict[attr.text])  # Overwrite keys with values from dict

# finally, create a string from the tree object
xml_str_modified = ET.tostring(xmltree, encoding="unicode")

print(xml_str_modified)

This prints:

<data>
    <country1>
        <rank>1</rank>
        <year>2008</year>
        <gdppc>141100</gdppc>
    </country1>
    <country2>
        <rank>4</rank>
        <year>2011</year>
        <gdppc>59900</gdppc>
    </country2>
    <country3>
        <rank>69</rank>
        <year>2011</year>
        <gdppc>13600</gdppc>
    </country3>
</data>

You can also pretty-print the output if desired, and the xml module can load the XML data directly from a file instead of a string.
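
As a rough sketch of both points, the snippet below reads the template from a file with ET.parse, fills it in, re-indents it, and writes it back out. The helper name fill_template_file and the file names template.xml and filled.xml are made up for illustration, the template is shortened to one country, and ET.indent only exists on Python 3.9 or newer, so it is guarded.

```python
import os
import tempfile
import xml.etree.ElementTree as ET

# Shortened sample data, for illustration only
value_dict = {"country1_rank": 1, "country1_year": 2008}

template = """<?xml version="1.0"?>
<data><country1><rank>country1_rank</rank><year>country1_year</year></country1></data>
"""


def fill_template_file(src_path, dst_path, values):
    """Hypothetical helper: read a template file, substitute values, write the result."""
    tree = ET.parse(src_path)  # load the XML directly from a file
    for country in tree.getroot():
        for field in country:
            field.text = str(values[field.text])
    if hasattr(ET, "indent"):  # ET.indent was added in Python 3.9
        ET.indent(tree, space="    ")
    tree.write(dst_path, encoding="utf-8", xml_declaration=True)


# Use a temporary directory so the example does not touch real files
with tempfile.TemporaryDirectory() as tmp:
    src = os.path.join(tmp, "template.xml")
    dst = os.path.join(tmp, "filled.xml")
    with open(src, "w", encoding="utf-8") as fh:
        fh.write(template)
    fill_template_file(src, dst, value_dict)
    with open(dst, encoding="utf-8") as fh:
        result = fh.read()

print(result)
```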

Note that this specific solution assumes all of your values sit at the same level in the XML file (exactly two levels below the root). If you need to handle arbitrary depths, the find, findall, or iter methods might help you.
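
One possible way to handle arbitrary depth is to walk every element with iter() and replace only those whose text is a known key. This is a sketch rather than the only approach; the nested capital/population elements and the capital_population key are invented here to show a deeper level.

```python
import xml.etree.ElementTree as ET

# Sample data; "capital_population" is a made-up key for a deeper element
value_dict = {"country1_rank": 1, "capital_population": 5000}

xml_str = """<data>
    <country1>
        <rank>country1_rank</rank>
        <capital>
            <population>capital_population</population>
        </capital>
    </country1>
</data>"""

root = ET.fromstring(xml_str)

# iter() visits the root and all of its descendants, whatever their depth
for element in root.iter():
    key = (element.text or "").strip()
    if key in value_dict:  # container elements only hold whitespace and are skipped
        element.text = str(value_dict[key])

result = ET.tostring(root, encoding="unicode")
print(result)
```

Because elements whose text is not a key are left alone, this variant also avoids a KeyError when the template contains text that is not meant to be replaced.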

Finally, instead of writing the value_dict keys as country1_rank, you could use a nested dict of the form {"country1": {"rank": 1, [...]}, "country2": [...]}. The placeholders in the template XML then only need the field name (rank, year, gdppc), which removes some redundancy. One line of my code would change:

attr.text = str(value_dict[country.tag][attr.text])

Note how I first index by country.tag, before attr.text.
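
Put together, the nested-dict variant might look like the following sketch. The name nested_dict and the shortened two-country template are assumptions made for the example; the placeholders now hold just the field name.

```python
import xml.etree.ElementTree as ET

# Nested layout: outer key is the country tag, inner key is the placeholder text
nested_dict = {
    "country1": {"rank": 1, "year": 2008},
    "country2": {"rank": 4, "year": 2011},
}

xml_str = """<data>
    <country1><rank>rank</rank><year>year</year></country1>
    <country2><rank>rank</rank><year>year</year></country2>
</data>"""

root = ET.fromstring(xml_str)

for country in root:
    for attr in country:
        # Index by the country's tag first, then by the placeholder text
        attr.text = str(nested_dict[country.tag][attr.text])

result = ET.tostring(root, encoding="unicode")
print(result)
```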

André

You could parse the XML with ElementTree and then access individual nodes with tree.find(). If you are free to choose the dict keys, you could format them as paths; with the keys from the question, splitting on the underscore yields the path to each element.

import xml.etree.ElementTree
import xml.dom.minidom

mapping = { 'country1_rank' : 1, 'country1_year' : 2008, 'country1_gdppc' : 141100, 'country2_rank' : 4, 'country2_year' : 2011, 'country2_gdppc' : 59900, 'country3_rank' : 69, 'country3_year' : 2011, 'country3_gdppc' : 13600 }

template = '''<?xml version="1.0"?>
<data>
    <country1>
        <rank>country1_rank</rank>
        <year>country1_year</year>
        <gdppc>country1_gdppc</gdppc>
    </country1>
    <country2>
        <rank>country2_rank</rank>
        <year>country2_year</year>
        <gdppc>country2_gdppc</gdppc>
    </country2>
    <country3>
        <rank>country3_rank</rank>
        <year>country3_year</year>
        <gdppc>country3_gdppc</gdppc>
    </country3>
</data>
'''

tree = xml.etree.ElementTree.fromstring(template)

for key, value in mapping.items():
    # 'country1_rank' -> './country1/rank', a path relative to the root element
    path = './' + key.replace('_', '/')
    tree.find(path).text = str(value)

print(xml.dom.minidom.parseString(xml.etree.ElementTree.tostring(tree, encoding='unicode', method='xml')).toprettyxml(newl=''))
'''Output:
<?xml version="1.0" ?><data>
        <country1>
                        <rank>1</rank>
                        <year>2008</year>
                        <gdppc>141100</gdppc>
        </country1>
        <country2>
                        <rank>4</rank>
                        <year>2011</year>
                        <gdppc>59900</gdppc>
        </country2>
        <country3>
                        <rank>69</rank>
                        <year>2011</year>
                        <gdppc>13600</gdppc>
        </country3>
</data>
'''
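
If the keys can be chosen freely, a variation on the above is to store them as paths from the start, so no string splitting is needed. The sketch below assumes a hypothetical path_mapping dict and a shortened template, and it collects keys that match nothing instead of failing, since find() returns None for a path with no match.

```python
import xml.etree.ElementTree as ET

# Keys are written as paths relative to the root; "country9/rank" matches nothing on purpose
path_mapping = {
    "country1/rank": 1,
    "country1/year": 2008,
    "country2/rank": 4,
    "country9/rank": 0,
}

template = """<data>
    <country1><rank>country1_rank</rank><year>country1_year</year></country1>
    <country2><rank>country2_rank</rank></country2>
</data>"""

root = ET.fromstring(template)
missing = []

for path, value in path_mapping.items():
    node = root.find(path)
    if node is None:  # no element at this path: remember it and move on
        missing.append(path)
        continue
    node.text = str(value)

result = ET.tostring(root, encoding="unicode")
print(result)
print("unmatched keys:", missing)
```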
yths