I have multiple excel files in my working directory. Few excel file contains multiple sheets, and few of the files contains single sheet. The issue that I have is that these excel files contains data starting at different rows instead of starting at row 1. Below snippet is example of excel file containing multiple sheets.As shown below Sheet 1 has data starting at row 3, whereas sheet 2 has data starting at row 5.Data are at random rows in other excel files. I would to do 3 below tasks.
- Create modified copy of the excel file such that the file contains data starting at row 1
- Split the sheet if the excel file contains multiple sheets.
- Finally, convert the modified sheet to CSV UTF-8
For task point 1 I referenced this link, using code provided by chip. I was able to get data for sheet 1 to move to row 2 in "modifiedsheet" (I can easily move it to row 1 by changing the number), but the issue that I am facing is this is manual approach and I would like to do it dynamically as I have multiple files with row all over the place.
Any help will be really appreciated! Thanks in advance for your time and efforts!
Sheet 1 Image
Sheet 2 Image
Python Code:
from openpyxl import Workbook
from openpyxl import load_workbook
wb = load_workbook("Myexcel_file1.xlsx")
ws1 = wb.active
ws2 = wb.create_sheet("modifiedSheet")
start_row = 3
start_col = 1
for row in ws1.iter_rows(min_row=start_row):
for cell in row:
# print(cell.value)
ws2.cell(row = start_row-2, column = start_col, value=cell.value) # start_row - 2 will assign the value to the same column up 2 rows
start_col += 1 # increment the column, for use of destination sheet
start_row += 1 # increment the column, for use of destination sheet
start_col = 1 # reset to first column after row processing
wb.save("modified.xlsx")