I am new to Python's xml.etree.ElementTree and am wondering how to properly convert the values in a pandas DataFrame into XML elements.

Here is a sample of what the values of my dataframe look like:

Id    Email                 State  Country        LastName
Mjkx  sealover71@yahoo.com  CA     United States  Withers

I know what I have below probably isn't close to working, but I figured what I want should be possible with df.iterrows(). How do I create an XML element for each column of each row with ElementTree?

for row in df.iterrows():
    id = ET.SubElement('ID', 'id')
    lastname = ET.SubElement('LastName', 'lastname')
    email = ET.SubElement('Email', 'email')
    state = ET.SubElement('State', 'state')
    country = ET.SubElement('Country', 'country')


print(ET.tostring(row, pretty_print=True).decode('utf-8'))

Here is what the desired output should look like so it can be posted to our CRM:

<crc>
  <lead>
    <id>Mjkx</id>
    <lastname>Withers</lastname>
    <email>sealover71@yahoo.com</email>
    <state>CA</state>
    <country>United States</country>
  </lead>
</crc>

Thanks for your help!

Preston G

1 Answer

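You had the right idea with ElementTree. Two things tripped up your attempt: SubElement takes the parent element as its first argument (followed by the tag name), and the standard library's tostring expects an Element and has no pretty_print option (that keyword belongs to lxml). Build a root element, add one lead element per row, and add one child per column: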

from io import StringIO
import pandas as pd
from xml.etree import ElementTree


# in-place prettyprint formatter
# https://web.archive.org/web/20200703234431/http://effbot.org/zone/element-lib.htm
def indent(elem, level=0):
    i = "\n" + level * "  "
    if len(elem):
        if not elem.text or not elem.text.strip():
            elem.text = i + "  "
        if not elem.tail or not elem.tail.strip():
            elem.tail = i
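        for child in elem:
            indent(child, level + 1)
        # after the loop, child is the last sub-element; pull its tail back
        # one level so the parent's closing tag lines up
        if not child.tail or not child.tail.strip():
            child.tail = i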
    else:
        if level and (not elem.tail or not elem.tail.strip()):
            elem.tail = i


data = """\
Id,Email,State,Country,LastName
Mjkx,sealover71@yahoo.com,CA,United States,Withers
Asdf,foo@yahoo.com,IL,United States,Capone
Qwer,bar@yahoo.com,NV,United States,Sinatra
"""

f = StringIO(data)
df = pd.read_csv(f)

#print(df)

root = ElementTree.Element('crc')

for _, r in df.iterrows():
    lead = ElementTree.SubElement(root, 'lead')

    for c in df.columns:
        e = ElementTree.SubElement(lead, c)
        e.text = str(r[c])

indent(root)
x = ElementTree.tostring(root)

print(x.decode('UTF-8'))
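
The desired output in the question uses lowercase tags in a fixed order (id, lastname, email, state, country), which is not the order of the DataFrame columns. A minimal sketch of one way to get that shape follows. The helper name records_to_crc and the FIELDS tuple are made up for illustration, and ET.indent assumes Python 3.9 or newer. The helper takes plain dicts so it runs without pandas.

```python
import xml.etree.ElementTree as ET

# Tag order copied from the desired output in the question (an assumption
# about what the CRM expects); keys match the DataFrame column names.
FIELDS = ("Id", "LastName", "Email", "State", "Country")


def records_to_crc(records, fields=FIELDS):
    """Hypothetical helper: build <crc> with one <lead> per record."""
    root = ET.Element("crc")
    for rec in records:
        lead = ET.SubElement(root, "lead")
        for name in fields:
            # lowercase the column name to get the tag name
            child = ET.SubElement(lead, name.lower())
            child.text = str(rec[name])
    ET.indent(root, space="  ")  # in-place pretty-printer, Python 3.9+
    return ET.tostring(root, encoding="unicode")


# One dict per row, keyed by column name
records = [
    {"Id": "Mjkx", "Email": "sealover71@yahoo.com", "State": "CA",
     "Country": "United States", "LastName": "Withers"},
    {"Id": "Asdf", "Email": "foo@yahoo.com", "State": "IL",
     "Country": "United States", "LastName": "Capone"},
]

xml_text = records_to_crc(records)
print(xml_text)
```

With the DataFrame from above, the records can come from df.to_dict("records"), which returns one dict per row keyed by column name. Passing encoding="unicode" makes tostring return a str, so no decode step is needed.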