0

I have an Excel template file which has columns as defined within an XML. Manually I can right click over the template then XML > import, and select the XML file, and finally save the file.

How can I perform this task automatically programming it in Python?

Example of XML file:

<DCPowerFlow>
    <branches>
        <branch>
            <busFrom name="bus_one" number="1" />
            <busTo name="bus_two" number="2" />
            <id>1</id>
            <rateA>1000</rateA>
            <resultPowerFlow>
                <probOverFlow>0.0</probOverFlow>
                <maxOverFlow>800</maxOverFlow>
            </resultPowerFlow>
        </branch>
        <branch>
            <busFrom name="bus_two" number="2" />
            <busTo name="bus_three" number="3" />
            <id>1</id>
            <rateA>1200</rateA>
            <resultPowerFlow>
                <probOverFlow>0.1</probOverFlow>
                <maxOverFlow>1300</maxOverFlow>
            </resultPowerFlow>
        </branch>
    </branches>
</DCPowerFlow>

To check for the manual task:

  1. Save the above example as an XML file.
  2. In order to create the Excel XLS template you can simply open the above XML example with Excel, ensure no data is on the template (delete data if within the import you added any) and save the file as XLS.
  3. Import the example XML file. Right click on the Excel template file created, then XML > Import, and select the XML example file.
  4. Save Template with data as a new XLS.

So what I need to do is to automate steps 3 and 4.

Cedric Zoppolo
  • 4,271
  • 6
  • 29
  • 59

2 Answers2

2

Have you tried using BeautifulSoup and Pandas? Note that the parser I use in the following script requires you to have lxml installed already. If you don't have it just pip install lxml.

import pandas as pd
from bs4 import BeautifulSoup

file = open("file.xml", 'r')
soup = BeautifulSoup(file, 'lxml')
df = pd.DataFrame({'ids': [x.text for x in soup.find_all('id')]})
df.to_excel('data.xls')

While you will have to figure out how you want to parse your file, that will give you the tools that you need. If you need more information about how to parse the file, try visiting the BeautifulSoup documentation. Using this code you can loop through all the files that you are interested in and parsing them into dataframes, then exporting them using the to_excel method.

  • I will give this a try. I have been using ElementTree for xml and pandas for reading excel files. – Cedric Zoppolo Mar 24 '17 at 20:39
  • The documentation says: 'I use Python 2.7 and Python 3.2 to develop Beautiful Soup, but it should work with other recent versions.' So hopefully it will, but if not, try using the BeautifulSoup3. Which reminds me, the most recent version should be installed with 'pip install bs4'. – Bradley Robinson Mar 24 '17 at 20:57
  • What if I want to import the whole structure of the xml file into the excel. Isn't there a simpler way? Or should I go bit by bit findind data and exporting to a dataframe to then save the excel file? – Cedric Zoppolo Mar 24 '17 at 21:35
  • Also I would add that in my Excel template I have already some conditional formatting I want to preserve. So I don´t want to write the excel, I need to import the data to an existing Excel file (i.e. template) – Cedric Zoppolo Mar 24 '17 at 22:03
  • I believe that what I attempt to do can be done with win32 module as stated on http://pythonexcels.com/basic-excel-driving-with-python/ – Cedric Zoppolo Apr 24 '17 at 19:01
0

Finally I could figure this out using win32com.client module.

I used the following code successfully to import an xml to an existing Excel xlsx file I use as template, and then save it with a different name:

import win32com.client as win32
excel = win32.gencache.EnsureDispatch('Excel.Application')
wb = excel.Workbooks.Open("D:/tmp/template.xlsx")
wb.XmlImport("D:/tmp/result.xml")
wb.SaveAs("D:\\tmp\\result.xlsx")
wb.Close()

Methods for Excel workbooks can be found here. Also I had to take into account that the saveAs method doesn't support forward slashes.

Cedric Zoppolo
  • 4,271
  • 6
  • 29
  • 59