0

I am needing to take an Excel file, transform the data and then save the transformed data in another sheet of the same Excel file. As of now, I am doing this by manually importing the Excel source sheet into a table within Microsoft SQL Server Management Studio, running my SQL script to transform the data and create a view of it. I then simply copy and paste this transformed data from the view back into a second sheet of the Excel file.

I am looking to automate this process using Python, but am very lost as to how to get started. I have seen packages such as pyodbc and pymysql which I could use to execute my script, but it is the importing/exporting the data to/from Excel that I cannot figure out.

Can either of these packages be integrated with SSMS to which the data can be imported/exported from a specified Excel file? Can SSMS be avoided altogether somehow by importing the Excel data into a table that my SQL script can then execute on?

Any suggestions are very welcome.

KOB
  • 4,084
  • 9
  • 44
  • 88
  • Does your SQL rely on any other tables other than the data you are loading in from your excel sheet? – jhole89 Jun 23 '17 at 13:25
  • Yes I use some reference tables for mappings. These are also originally imported from Excel. – KOB Jun 23 '17 at 13:31

2 Answers2

0

I am not really sure about the performances/features of most XLSX python libraries. I would go CSV for SQL interoperability, but if you are stuck with XLSX format I can say that XLSXWriter python module is ok for exports, but I could not recommend you any strong library that would manage import, export all with data schemes compatible with SQL...

However, I found out that it seems possible to execute Python WITHIN Excel, which might prove useful:

Calling python script from excel/vba

Fabien
  • 4,862
  • 2
  • 19
  • 33
0

As I understand it there are a couple of things to consider here:

  1. Data in the Excel file.
  2. Reference data on SQL server.
  3. SQL transformations.

I'd suggest just doing it all in python using pandas like so:

import pandas
from sqlalchemy import create_engine

# step1 - load excel sheet to dataframe
excel_data_df = pandas.read_excel('/path/to/excel/file.xls', sheetname='someSheetName')

# step2 - load ref data to dataframe
ServerName = "myserver"
Database = "mydatabase"
TableName = "mytable"

engine = create_engine('mssql+pyodbc://' + ServerName + '/' + Database)
conn = engine.connect()

sql_table1_df = pandas.read_sql_table(TableName, con=conn)

# step3 - do your sql operations in pandas dataframes. For example join the two together:
result = pandas.merge(excel_data_df, sql_table1_df, how='inner', on=['key1','key2'])

# Finish by writing back to excel
result.to_excel('path/to/excel/file.xls', sheet_name='someOtherSheet')

I'm unable to test this at the moment so there may be a couple of bugs but the overall approach should work.

jhole89
  • 718
  • 9
  • 28
  • This seems like a good solution, thank you. The one problem I had was that this project actually has several different types of Excel files that need to be transformed on a regular basis - each with different transformation rules written as SQL scripts. It would be a substantial task to translate all of these SQL scripts into the pandas equivalent. However, I have just come across the pandasql package which can run SQL queries on pandas dataframes. – KOB Jun 23 '17 at 14:52
  • Ok, you didnt mention that in the OP so i assumed it was only 1 excel file. Where are you getting these excel files from anyway, an alternative would be to just keep them in the database and do all your work there. What is the purpose of the excel export at the end anyway? – jhole89 Jun 23 '17 at 14:57
  • The files are being transformed to be ported from one database shema to another. They are sent to us as Excel files. – KOB Jun 23 '17 at 15:02
  • Ok, well i dont think this is really the right approach to do that, but that's moving away from the original question. If this answers your OP then please accept. – jhole89 Jun 25 '17 at 17:31
  • Yes it did for the most part. Apologies, accepted your answer now – KOB Jul 27 '17 at 10:11