0

**I have a XML file where i need to read it and save each column to an excel file. Can someone pls help .

I have some lines after the declare statement , But I want to parse from the table1 till /table1 can someone help me ?

**<?xml version="1.0" encoding="Metadata" ?>
    <DECLARE  lmsid ="asdhgh"
     ...........
    </table1 name ="employee table" name ="E1 Enterprises" refid ="201"
     <data id = "ABC" emp = "dt">
     <country id ="m1" name =dt1">
     <rank text> "data"</rank text>
     <rank textd> "direction"</rank textd>
     <reference>
     <ref id ="9900m" id1="1000" ref="URL">
     </reference>
     </country>
    <data id = "xyz" emp = "dt1">
    <country id ="m2" name =dt2">
    <rank text> "data1"</rank text>
    <rank textd> "direction1"</rank textd>
    <reference>
    <ref id ="9900m" id1="2000" ref="URL">
    </reference>
    </country>
    </data id>
    ....
    </table1>
    </table1 name ="Manager table" name ="E1 Enterprises" refid ="202"
    <data id = "ARZ" emp = "dt">
    <country id ="m1" name =dt1">
     <rank text> "data"</rank text>
     <rank textd> "direction"</rank textd>
     <reference>
     <ref id ="9900m" id1="1000" ref="URL">
     </reference>
     </country>
     <data id = "QNC" emp = "dt1">
     <country id ="m2" name =dt2">
     <rank text> "data1"</rank text>
     <rank textd> "direction1"</rank textd>
     <reference>
     <ref id ="9900m" id1="2000" ref="URL">
     </reference>
     </country>
     </data id>
      ....
     </table1>
...

Thanks Aarush **

Aarush
  • 37
  • 7
  • Does this answer your question? [How do I parse XML in Python?](https://stackoverflow.com/questions/1912434/how-do-i-parse-xml-in-python) – siralexsir88 Jun 18 '20 at 18:56
  • Please show that you did some research and made an earnest attempt. What have you tried so far? – mzjn Jun 18 '20 at 19:05
  • I tried with Element tree, im able to get the values but im not sure how to save in excel also how to start the parsing from Table1. – Aarush Jun 18 '20 at 19:35

2 Answers2

0

So I think you can just use BeautifulSoup to parse XML things. I found this snippet of code online

# Import BeautifulSoup
from bs4 import BeautifulSoup

content = []

# Read the XML file
with open("sample.xml", "r") as file:

    # Read each line in the file, readlines() returns a list of lines
    content = file.readlines()

    # Combine the lines in the list into a string
    content = "".join(content)
    soup = BeautifulSoup(content, "lxml")

    #Do things

BS4 can find the xml tags pretty easy. Its docs are extensive, but something like soup.find('data', id='xyz') if you were looking for that info. Then just export to csv with pandas or csv module.

tjblue
  • 33
  • 1
  • 5
0

Not sure what you mean by save each column. An XML has: - tag name - attributes - text

You can use the xml.dom.minidom module

>>> s = '<t><a name="1"></a><a name="2"></a></t>'
>>> x = xml.dom.minidom.parseString(s)
>>> a = x.getElementsByTagName("a")
>>> for i in a:
...     print i.getAttribute("name")
...     
1
2

You can also parse a .xml file. x = xml.dom.minidom.parse("c:\xmlFile.xml")

See more detail in the documentation:x = https://docs.python.org/2/library/xml.dom.minidom.html

Once you have the value you want to save into your excel you can run a SQL statement with pyodbc and the microsoft odbc driver(Microsoft Excel Driver (*.xls, *.xlsx, *.xlsm, *.xlsb)}):

import pyodbc

connection = pyodbc.connect("Driver={Microsoft Excel Driver (*.xls, *.xlsx, *.xlsm, *.xlsb)}; readonly=0; DBQ=C:\yourfileName.xlsx")

cursor = connection.cursor()
sql = "insert into [Sheet1$] (col1,col2) values (val1,val2)"
cursor.execute(sql)
Vivian Yung
  • 79
  • 1
  • 9
  • My output should look like in excel table1 name refid data emp countryID name .... employee table E1 Enterprises 201 ABC dt m1 dt1 each tag and its values to be populated to an excel sheet , Thats what i meant. – Aarush Jun 18 '20 at 19:30
  • i've edit the above and added an example of how to insert data into excel via sql – Vivian Yung Jun 18 '20 at 20:26