I need to convert a pandas DataFrame to an XML file. I used the approach from "How do convert a pandas/dataframe to XML?" and got partway there. However, the output is still missing the root element, since (I think) the function loops through each row.

Here is the Python script:

import pandas as pd 

# Import the Excel file and call it excel_file
excel_file = pd.ExcelFile('xlsx_files/grp-DocumentFilesArchivar-Places-BatchIslerProjects-20190905-aw.xlsx') 

# View the excel_file's sheet names 
print(excel_file.sheet_names) 

# Load the excel_file's Sheet1 as a dataframe 
df = excel_file.parse('Sheet1') 
print(df) 

def func(row):

    xml = ['<item>']
    for field in row.index:
        xml.append('  <{0}>{1}</{0}>'.format(field, row[field]))
    xml.append('</item>')

    return '\n'.join(xml)

with open('output_file/test.xml', 'w') as f:

    print('\n'.join(df.apply(func, axis=1)), file=f)

Here is the output:

<item>
  <MH_Marker>Charge_201806
Gesamtbestand
Charge_gelbgruen
Charge_Isler_Projects</MH_Marker>
  <T_ID>550238</T_ID>
  <T_UUID>BCC3E70C-54BF-4218-9376-3AB94111DDAD</T_UUID>
  <MH_DocType>Konvolut</MH_DocType>
  <MH_id_RoleUUID>927BD345-0CE7-42B6-B0C8-56628423C612</MH_id_RoleUUID>
  <MH_RoleName>Objektort</MH_RoleName>
  <MH_UUID_GRP>43E9BF33-48E0-438A-8A6F-2139507F2075</MH_UUID_GRP>
</item>

What I would like to have:

<?xml version="1.0" ?>
<root>

 <item>
  <MH_Marker>Charge_201806
  Gesamtbestand
  Charge_gelbgruen
  Charge_Isler_Projects</MH_Marker>
  <T_ID>550238</T_ID>
  <T_UUID>BCC3E70C-54BF-4218-9376-3AB94111DDAD</T_UUID>
  <MH_DocType>Konvolut</MH_DocType>
  <MH_id_RoleUUID>927BD345-0CE7-42B6-B0C8-56628423C612</MH_id_RoleUUID>
  <MH_RoleName>Objektort</MH_RoleName>
  <MH_UUID_GRP>43E9BF33-48E0-438A-8A6F-2139507F2075</MH_UUID_GRP>
 </item>  

</root>

Suggestions?

Thanks!

Pelide

1 Answer

Well, an easy solution is to wrap your current XML in the final XML you wish to output. Save your initial output in a variable, corexml (replace the placeholder below):

from string import Template
corexml = 'XXXX'  # placeholder: replace with the joined <item> output

xml_wrap = Template('<?xml version="1.0" ?><root>$corexml</root>')
xml = xml_wrap.substitute({'corexml' : corexml})
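
To show how this might slot into the question's script, here is a minimal sketch. It assumes the per-row strings have already been rendered; the `items` list is a hypothetical stand-in for `df.apply(func, axis=1)`, and `wrap_in_root` is an illustrative helper name, not part of any library. Line breaks are added to the template only so the output is easier to read.

```python
from string import Template


def wrap_in_root(corexml):
    """Wrap already-rendered <item> blocks in an XML declaration and a <root> element."""
    xml_wrap = Template('<?xml version="1.0" ?>\n<root>\n$corexml\n</root>')
    return xml_wrap.substitute({'corexml': corexml})


# Hypothetical stand-in for df.apply(func, axis=1): one string per row
items = [
    '<item>\n  <T_ID>550238</T_ID>\n  <MH_DocType>Konvolut</MH_DocType>\n</item>',
    '<item>\n  <T_ID>550239</T_ID>\n  <MH_DocType>Konvolut</MH_DocType>\n</item>',
]

# Join the rows first, then wrap once, so <root> appears a single time
corexml = '\n'.join(items)
document = wrap_in_root(corexml)
print(document)
```

In the question's script, `document` would then be written with `print(document, file=f)`. Note that this approach does no escaping, so cell values containing `&` or `<` would likely need something like `xml.sax.saxutils.escape` before being formatted into the tags.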
ParalysisByAnalysis