You can use pd.read_excel() to read the workbook that holds your data, in your case 'test1.xlsx'. You can then use openpyxl.load_workbook() to open the existing workbook that holds the header, in your case 'merge1.xlsx'. Finally, you can save the result under a new name ('test3.xlsx') so that neither of the two existing workbooks is changed.
Below I've provided a fully reproducible example of how you can do this. To keep it self-contained, I first create 'merge1.xlsx' and 'test1.xlsx'.
Please note that if your 'merge1.xlsx' contains only the header you want and nothing else, you can use the two lines I've left commented out below. They simply append your data from 'test1.xlsx' beneath the header in 'merge1.xlsx', and in that case you can get rid of the two for loops at the end. Otherwise, as in my example, it's a bit more complicated.
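For the simple header-only case, a minimal sketch of the append approach might look like the following. It builds the workbook in memory with openpyxl instead of loading 'merge1.xlsx', and the helper name append_frame is hypothetical, chosen only for illustration. It also passes index=False, which is an assumption on my part to keep the index column out of the sheet.

```python
import pandas as pd
from openpyxl import Workbook
from openpyxl.utils.dataframe import dataframe_to_rows


def append_frame(ws, df):
    """Append the column names and rows of df below the last used row of ws.

    Hypothetical helper, not part of pandas or openpyxl.
    """
    # index=False skips the index column; header=True emits the column names first.
    for row in dataframe_to_rows(df, index=False, header=True):
        ws.append(row)


# Stand-in for a 'merge1.xlsx' that holds only the merged header (assumption).
wb = Workbook()
ws = wb.active
ws["A1"] = "This is a merged cell"
ws.merge_cells("A1:C1")

df = pd.DataFrame({"col_1": [1, 2, 3], "col_2": ["A", "B", "C"]})
append_frame(ws, df)
# wb.save("test3.xlsx") would then write the result to disk.
```

Because ws.append() always writes below the last used row, this only gives the layout you want when nothing else sits under the header.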
In creating 'test3.xlsx', we loop through each row and determine how many columns there are using len(df3.columns). In my example this is equal to two, but the code would also work for a greater number of columns.
import pandas as pd
from openpyxl import load_workbook
from openpyxl.utils.dataframe import dataframe_to_rows

# Create 'merge1.xlsx': a merged header cell plus an extra string.
df1 = pd.DataFrame()
writer = pd.ExcelWriter('merge1.xlsx', engine='xlsxwriter')  # merge_range is an xlsxwriter method
df1.to_excel(writer, sheet_name='Sheet1')
ws = writer.sheets['Sheet1']
ws.merge_range('A1:C1', 'This is a merged cell')
ws.write('A3', 'some string I might not want in other workbooks')
writer.close()  # use close(); recent pandas versions no longer have writer.save()

# Create 'test1.xlsx' with the data.
df2 = pd.DataFrame({'col_1': [1,2,3,4,5,6], 'col_2': ['A','B','C','D','E','F']})
writer = pd.ExcelWriter('test1.xlsx')
df2.to_excel(writer, sheet_name='Sheet1')
writer.close()

# Read the data back and open the workbook that holds the header.
df3 = pd.read_excel('test1.xlsx', index_col=0)  # the first column is the saved index
wb = load_workbook('merge1.xlsx')
ws = wb['Sheet1']

#for row in dataframe_to_rows(df3):
#    ws.append(row)

# Write the column names in row 2, starting at column B.
column = 2
for item in list(df3.columns.values):
    ws.cell(row=2, column=column).value = str(item)
    column = column + 1

# Write the index and the data from row 3 onward.
for row_index, row in df3.iterrows():
    ws.cell(row=row_index+3, column=1).value = row_index #comment out to remove index
    for i in range(0, len(df3.columns)):
        ws.cell(row=row_index+3, column=i+2).value = row.iloc[i]

wb.save("test3.xlsx")
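If you need this in more than one place, the two loops can be wrapped in a small helper. The sketch below is one possible way to do it, not the only one: the function name write_frame and its parameters are hypothetical, and the workbook is built in memory rather than loaded from 'merge1.xlsx'. It counts rows by position instead of by index label, so a DataFrame whose index does not start at 0 still lands directly under the header.

```python
import pandas as pd
from openpyxl import Workbook


def write_frame(ws, df, first_row=2, first_col=1, with_index=True):
    """Write df into ws with its header row at first_row (1-based).

    Hypothetical helper, not part of pandas or openpyxl.
    """
    # Data columns start one cell to the right when the index is written.
    col_offset = first_col + (1 if with_index else 0)

    # Header row: one cell per column name.
    for j, name in enumerate(df.columns):
        ws.cell(row=first_row, column=col_offset + j, value=str(name))

    # Data rows: pair each index label with a plain tuple of row values.
    rows = zip(df.index, df.itertuples(index=False, name=None))
    for i, (label, values) in enumerate(rows, start=1):
        if with_index:
            ws.cell(row=first_row + i, column=first_col, value=label)
        for j, value in enumerate(values):
            ws.cell(row=first_row + i, column=col_offset + j, value=value)


# Stand-in for the workbook with the merged header (assumption).
wb = Workbook()
ws = wb.active
ws["A1"] = "This is a merged cell"
ws.merge_cells("A1:C1")

df = pd.DataFrame({"col_1": [1, 2, 3], "col_2": ["A", "B", "C"]})
write_frame(ws, df, first_row=2)
```

Passing with_index=False drops the index column and shifts the data one column to the left, which has the same effect as commenting out the index line in the loop above.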
Expected Output of the 3 Workbooks:
