I am new to Python and I have no big experience with this language. I have a CSV file from where I have to get the data into an XML structure.
I want to do it with Pandas and ElementTree
.
I read a tutorial to do so, but I can't understand the structure of the code.
The CSV file looks something like this
test_name,health_feat,result
test_1,20,1
test_2,23,1
test_3,24,0
test_4,12,1
test_5,45,0
test_6,34,1
test_7,78,1
test_8,23,1
test_9,12,1
test_10,12,1
The final XML file should look like this, but I am not sure how to handle attributes when applying ElementTree
:
<xml version = '1.0' encoding = 'UTF-8'>
<Test Testname = 'test_1' >
<Health_Feat>20</health_feat>
<Result>1</Result>
</Test>
<Test Testname = 'test_2'>
<Health_Feat>23</Healt_Feat>
<Result>1</Result>
</Test>
<Test Testname = 'test_3'>
<Health_Feat>24</Healt_Feat>
<Result>0</Result>
</Test>
<Test Testname = 'test_4'>
<Health_Feat>30</Healt_Feat>
<Result>1</Result>
</Test>
<Test Testname = 'test_5'>
<Health_Feat>12</Healt_Feat>
<Result>1</Result>
</Test>
<Test Testname = 'test_6'>
<Health_Feat>45</Healt_Feat>
<Result>1</Result>
</Test>
<Test Testname = 'test_7'>
<Health_Feat>34</Healt_Feat>
<Result>0</Result>
</Test>
<Test Testname = 'test_8'>
<Health_Feat>78</Healt_Feat>
<Result>1</Result>
</Test>
<Test Testname = 'test_9'>
<Health_Feat>23</Healt_Feat>
<Result>1</Result>
</Test>
<Test Testname = 'test_10'>
<Health_Feat>12</Healt_Feat>
<Result>1</Result>
</Test>
</Tests>
Currently I tried something like this, but I don't know how to tell the program which line to take from the csv.
import pandas as pd
from lxml import etree as et
import uuid
df = pd.read_csv('mytests.csv', sep = ',')
root = et.Element(Tests)
for index, row in df.iterrows():
if row['test_name'] == 'test_1':
Test = et.SubElement(root, 'Test')
Test.attrib['fileUID']
health_feat = et.subElement('health_feat')
Result = et.subElement('Result')
else:
Tests = et.subElement(root, 'Tests')
et.ElementTree(root).write('mytests.xml', pretty_print = True, xml_declaration = True, encoding = 'UTF-8', standalone = None)