1

I'm trying to convert Excel to nested XML and could not succeed as expected.

Here is my code.

import openpyxl
import xml.etree.ElementTree as etree

# reading data from the source, xls
wb1 = openpyxl.load_workbook(filename='C:\GSH\parent_child.xlsx')
ws1 = wb1.get_sheet_by_name('Sheet1')
row_max = ws1.max_row

# creating xml tree structure
root = etree.Element('Hierarchy')

# iterating through the xls and creating children based on the condition
for row_values in range(2, row_max+1):
    parent = etree.SubElement(root, 'parent')
    parent.text = ws1.cell(column=1, row=row_values).value
    root.append(parent)
    if (ws1.cell(column=1, row = row_values).value == ws1.cell(column=2, row = row_values-1).value):
        print("------Inside if condition")
        print(ws1.cell(column=2, row=row_values).value)
        child = etree.SubElement(parent, 'child')
        child.text = ws1.cell(column=2, row=row_values).value
        parent.append(child)
        print("-------Inside if condition")
    tree = etree.ElementTree(root)

tree.write('C:\GSH\gsh.xml')

I am getting XML like this..

enter image description here

However, my XML should look like this.

enter image description here

Any suggestions, please.

enter image description here

The above is the source XLS from which I am working on.

Vishy
  • 281
  • 1
  • 2
  • 9

1 Answers1

1

You can set variable name instead of parent and child. This code is only part of your list and seems tricky but works fine. d[child[i]].text = " " is only to show both sides of tags. For making var in loop with dictionary, please refer this.

import xml.etree.ElementTree as ET

India = ET.Element('India')  # set root
parent = ['India', 'Telangana', 'Telangana', 'Telangana','Nalgonda']  # parent list
child = ['Telangana', 'Cyberabad', 'Warangal','Nalgonda','BusStation']  # child list

d = {}  # use dictionary to define var in loop
d['India'] = India

for i in range(len(child)):

    for k, v in d.items():
        if k == parent[i]:
            pa = v
            break

    d[child[i]] = ET.SubElement(pa, child[i])
    d[child[i]].text = " "  # to get both side of tags

tree = ET.ElementTree(India)
tree.write('gsh.xml')

# <India>
# <Telangana>
# <Cyberabad> </Cyberabad>
# <Warangal> </Warangal>
# <Nalgonda>
# <BusStation> </BusStation>
# </Nalgonda>
# </Telangana>
# </India>
shimo
  • 2,156
  • 4
  • 17
  • 21
  • Thanks a ton, you saved my day..Can we get the values outside the tags, instead of them inside? I mean instead of , I am still struggling to get NelloreChittoor – Vishy Jan 05 '20 at 16:44
  • I have updated the xls source in original question. – Vishy Jan 05 '20 at 18:09
  • I don’t see what is your final results. In my example code, varname is str(varname). Use .text to put string inside a tag. Please try. – shimo Jan 06 '20 at 04:06
  • I have updated the answer on final results which should look like. I'm trying your suggestion and keep you posted.. – Vishy Jan 06 '20 at 05:27
  • I think you’ve far changed both question and expected answer. Still, you can do by expandibng my code. I’m afraid I won’t do anymore. – shimo Jan 06 '20 at 08:39
  • yeah, you are right. Thanks for the answer man. I will try doing something from here. – Vishy Jan 06 '20 at 09:13