I am new to Python and have little experience with the language. I have a CSV file whose data I need to convert into an XML structure, and I want to do it with Pandas and ElementTree.

I read a tutorial to do so, but I can't understand the structure of the code.

The CSV file looks something like this:

test_name,health_feat,result
test_1,20,1
test_2,23,1
test_3,24,0
test_4,12,1
test_5,45,0
test_6,34,1
test_7,78,1
test_8,23,1
test_9,12,1
test_10,12,1

The final XML file should look like this, but I am not sure how to handle attributes when using ElementTree:

<?xml version='1.0' encoding='UTF-8'?>
<Tests>
    <Test Testname='test_1'>
        <Health_Feat>20</Health_Feat>
        <Result>1</Result>
    </Test>
    <Test Testname='test_2'>
        <Health_Feat>23</Health_Feat>
        <Result>1</Result>
    </Test>
    <Test Testname='test_3'>
        <Health_Feat>24</Health_Feat>
        <Result>0</Result>
    </Test>
    <Test Testname='test_4'>
        <Health_Feat>30</Health_Feat>
        <Result>1</Result>
    </Test>
    <Test Testname='test_5'>
        <Health_Feat>12</Health_Feat>
        <Result>1</Result>
    </Test>
    <Test Testname='test_6'>
        <Health_Feat>45</Health_Feat>
        <Result>1</Result>
    </Test>
    <Test Testname='test_7'>
        <Health_Feat>34</Health_Feat>
        <Result>0</Result>
    </Test>
    <Test Testname='test_8'>
        <Health_Feat>78</Health_Feat>
        <Result>1</Result>
    </Test>
    <Test Testname='test_9'>
        <Health_Feat>23</Health_Feat>
        <Result>1</Result>
    </Test>
    <Test Testname='test_10'>
        <Health_Feat>12</Health_Feat>
        <Result>1</Result>
    </Test>
</Tests>

This is what I have tried so far, but I am not sure it is the right way to tell the program which row of the CSV each element comes from.

import pandas as pd
from lxml import etree as et

df = pd.read_csv('mytests.csv', sep=',')

# The root element name must be passed as a string
root = et.Element('Tests')

for index, row in df.iterrows():
    # One <Test> element per CSV row, with the test name as an attribute
    test = et.SubElement(root, 'Test')
    test.attrib['Testname'] = str(row['test_name'])
    # SubElement (capital S) takes the parent first, then the tag name
    health_feat = et.SubElement(test, 'Health_Feat')
    health_feat.text = str(row['health_feat'])
    result = et.SubElement(test, 'Result')
    result.text = str(row['result'])

et.ElementTree(root).write('mytests.xml', pretty_print=True, xml_declaration=True, encoding='UTF-8', standalone=None)
Jason Aller
Johannes
  • Hello, Johannes. Can you show us the code of your attempt and tell us what went differently from what you expected? – Bonifacio2 Aug 12 '19 at 11:42
  • possible duplicate https://stackoverflow.com/questions/41059264/simple-csv-to-xml-conversion-python – Shreyash Sharma Aug 12 '19 at 11:47
  • @Bonifacio2 I have not written that much so far. I read a tutorial on how to do it, but it uses a different structure in its XML. – Johannes Aug 12 '19 at 12:47

1 Answer

Something like this:

import pandas as pd

df = pd.read_csv('your_csv.csv', sep=',')


def csv_to_xml(row):
    # Attribute names must match the CSV header exactly (all lower case here)
    return """<Test Testname="%s">
    <Health_Feat>%s</Health_Feat>
    <Result>%s</Result>
</Test>""" % (row.test_name, row.health_feat, row.result)

Then call the function for every row of the DataFrame in a for loop (for example with df.iterrows()) and wrap the joined results in a <Tests> root element.
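
If you would rather build the tree with ElementTree than with string formatting, the per-row loop could look roughly like the sketch below. It is only an illustration under a few assumptions: it uses the standard-library xml.etree.ElementTree and csv modules instead of lxml and pandas so that it runs without extra packages, the sample data is a shortened copy of the CSV from the question, and rows_to_xml is a made-up helper name.

```python
import csv
import io
import xml.etree.ElementTree as ET

# Shortened sample copied from the question; a real script would read the file.
CSV_TEXT = """test_name,health_feat,result
test_1,20,1
test_2,23,1
test_3,24,0
"""


def rows_to_xml(rows):
    """Build a <Tests> tree with one <Test> per row (hypothetical helper)."""
    root = ET.Element("Tests")
    for row in rows:
        # The test name becomes an attribute of <Test> ...
        test = ET.SubElement(root, "Test", Testname=str(row["test_name"]))
        # ... and the remaining columns become child elements with text.
        ET.SubElement(test, "Health_Feat").text = str(row["health_feat"])
        ET.SubElement(test, "Result").text = str(row["result"])
    return root


# Each row is a dict keyed by the CSV header names.
rows = list(csv.DictReader(io.StringIO(CSV_TEXT)))
root = rows_to_xml(rows)

# Print the tree as text; to write a file with an XML declaration instead, use
# ET.ElementTree(root).write("mytests.xml", encoding="UTF-8", xml_declaration=True)
print(ET.tostring(root, encoding="unicode"))
```

With pandas, df.to_dict('records') should give the same kind of list of dicts, so the helper could be called as rows_to_xml(df.to_dict('records')). Unlike the string template, ElementTree also escapes characters such as & and < in the values for you.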

Kostas Charitidis