0

This is my first time here. I'm a beginner in python. I hope my question makes sense and if there is any clarification needed please let me know.

I have a project in which I'm pulling in data using an API to an excel file. Once I have this data I'm splitting it on a unique column (unique identifier) to split the records and have their own corresponding excel file. So, for example I have 10 customers that answered some questions which would then end up having their own 10 excel files with those questions as vars/columns. This part is also done (see code below).

Now, I have a master excel template that has some sheets in it (these are follow up forms, etc). I'm trying to transfer those individual customer excel files (create a new sheet within that template) to that template. I still want to have those 10 individual excel files but they now have those template sheets alongside the individual's data sheet.

So I want to iterate through any # of excel customer files (API data split on unique customer ID) in the folder and have each file transfer as a sheet to that template and do the process over. So Customer 1 sheet would transfer to that template as a new sheet. Then same thing would happen for Customer 2 sheet until all the individual customer excel files now have those template sheets in them.

Please note that the template has formatting in it and it is empty. I'm not trying to merge or append data. Which means that I cannot read in all the template sheets as a dataframe. In essence I'm trying to put customer sheet into that template then repeat the process where I finally end up with 10 or any # of customer files that now have that template in them.

Sorry for the long explanation and if there is anything I'm missing please let me know.

Code that creates those individual excel files

# read in excel data
temp = pd.read_excel('Customer_data.xlsx')

#use ResponseID to create unique values
split_values = temp['CustomerID'].unique()

#go through each unique value to create corresponding separate files
for value in split_values:
    df1 = temp[temp['CustomerID'] == value]
    output_file_name = "Customer_" + str(value) + "_file.xlsx"
    df1.to_excel(output_file_name, index=False)

A code using xlwings to keep formatting and join an individual customer excel file to the template (This does it for one customer file only. I want to repeat the process for those 10 or any # of individual files in the folder)

import xlwings as xw

df = pd.read_excel('Customer_1_file.xlsx')

# Use dict for sheet/df mapping
sheet_df_mapping = {'Customer': df}

# Open Excel in background
with xw.App(visible=False) as app:
    wb = app.books.open('Template.xlsx')

# List of current worksheet names
current_sheets = [sheet.name for sheet in wb.sheets]

# Iterate over sheet/df mapping 
# If sheet already exist, overwrite current cotent. Else, add new sheet
for sheet_name in sheet_df_mapping.keys():
    if sheet_name in current_sheets:
        wb.sheets(sheet_name).range('A1').options(index=False).value = sheet_df_mapping.get(sheet_name)
    else:
        new_sheet = wb.sheets.add(after=wb.sheets.count)
        new_sheet.range('A1').options(index=False).value = sheet_df_mapping.get(sheet_name)
        new_sheet.name = sheet_name
wb.save()

UPDATE

So after doing some trial and error I was able to get some part of the code working. I now have those 10 customer files that each have those template forms in them as new added sheets. But I'm losing formatting on the forms since it is being loaded with pandas. Is there anyway to retain the format (color, charts, etc.) of the forms while they are being added to the customer files? Below is the new code.

import glob
import xlwings as xw
import pandas as pd

# the customer files
excel_names = glob.glob('*.xlsx')

# iterating through the customer files
for excel in excel_names:
    df = pd.read_excel('Template.xlsx')
    # template forms that will be added to the customer files
    sheet_df_mapping = {'Form1': df,
                        'Form2': df}
            
    with xw.App(visible=False) as app:
        # open up those customer excel files
        wb = app.books.open(excel)

        # List of current worksheet names
        current_sheets = [sheet.name for sheet in wb.sheets]
        
        # Iterate over sheet/df mapping 
        # If sheet already exists, overwrite current cotent. Else, add new sheet
        for sheet_name in sheet_df_mapping.keys():
            if sheet_name in current_sheets:
                wb.sheets(sheet_name).range('A1').options(index=False).value = sheet_df_mapping.get(sheet_name)
            else:
                new_sheet = wb.sheets.add(after=wb.sheets.count)
                new_sheet.range('A1').options(index=False).value = sheet_df_mapping.get(sheet_name)
                new_sheet.name = sheet_name
        wb.save()

Some pics for the process and what I want:

This is the full customer file which is first initially split by their ID#

The template which has those 2 forms with the color and format

After splitting by ID and looping through customer ID's this is Customer 1 with the forms added (see below sheets)

Customer 1 has their own 2 forms with full formatting and this should happen with all customers but I'm losing the formatting

The file directory.

Ali
  • 1
  • 2
  • Lots and lots of q&a on here, even some have example code you can edit to match your situation. See this for one: https://stackoverflow.com/q/30575923/4961700 – Solar Mike Oct 08 '22 at 07:31
  • Thanks for the helpful link but it's not quite what I want to do. – Ali Oct 10 '22 at 09:46
  • That is the benefit of what people post on here, YOU can edit it to suit YOUR situation. – Solar Mike Oct 10 '22 at 10:33

1 Answers1

0

I believe the following will work for you, if I have interpreted your question correctly and you want to copy the data from each "Customer_x_file.xlsx" file to the "Template.xlsx" file, overwriting any sheet that already exists, or creating a new sheet otherwise.

Note that I have removed the dictionary of the df, and instead have just overwritten df each for each customer file in the folder.

import os
from glob import glob
import pandas as pd
import xlwings as xw

# folder path where the .xlsx files are located
folder_path = "folder_path"
# the start of the Customer file names (to filter the files in folder)
file_prefix = "Customer"
# the extension of the Customer files (to filter the files in folder)
file_extension = ".xlsx"
# the end of the file name that should be removed (string after the customer number, as per your example)
file_suffix = "_file"

# add directory with files, if python script not located in there.
os.chdir(folder_path)

# Open Excel in background
with xw.App(visible=False) as app:
    wb = app.books.open('Template.xlsx')

# List of current worksheet names
current_sheets = [sheet.name for sheet in wb.sheets]

# list of files that are in the folder that start with "Customer", and are type .xlsx
cust_files = glob(os.path.join(folder_path, file_prefix+"*"+file_extension))

# sort files by customer number (which is in the file name)
cust_files.sort(key=lambda x: int(x.rstrip(file_suffix+file_extension).split(file_prefix+"_")[1]))

# loop through all files in list
for file in cust_files:
    # create a dataframe of the data in the file
    df = pd.read_excel(file)
    # sheet_name is the same as the Customer's file name, but without the "_file" suffix or .xlsx
    sheet_name = os.path.relpath(file).rstrip(file_extension).rstrip(file_suffix)
    # if the sheet_name is in the current_sheets, over-write
    if sheet_name in current_sheets:
        # clear the contents of the sheet, in case size of data before is larger than new df
        wb.sheets(sheet_name).clear_contents()
        # paste new data from A1
        wb.sheets(sheet_name).range('A1').options(index=False).value = df
    else:
        # if sheet_name doesn't exist, create new sheet
        new_sheet = wb.sheets.add(after=wb.sheets.count)
        # paste new data from A1
        new_sheet.range('A1').options(index=False).value = df
        # name sheet as sheet_name
        new_sheet.name = sheet_name
# save workbook
wb.save()
Rawson
  • 2,637
  • 1
  • 5
  • 14
  • Thank you for the help. I was able to get one part working but am now losing the formatting for the forms sheets in the template. I have updated my code. If anyone knows how to retain the format of the forms please let me know. – Ali Oct 10 '22 at 09:45