I wondered if anyone could help. I am combining multiple excel files that have between 1-3 sheets. I want to combine these sheets into 3 dataframes and have done so using the below code:
all_workbook1 = pd.DataFrame()
all_workbook2 = pd.DataFrame()
all_workbook3 = pd.DataFrame()
for f in glob.glob("*.xlsx"):
dfworkbook1 = pd.read_excel(f, sheet_name="sheet1", usecols="B:AO")
dfworkbook1["Filename"] = "[" + os.path.basename(f) + "]"
all_workbook1 = all_workbook1.append(dfworkbook1,ignore_index=True)
dfworkbook2 = pd.read_excel(f, sheet_name="sheet2", usecols="B:AO")
dfworkbook2["Filename"] = "[" + os.path.basename(f) + "]"
all_workbook2 = all_workbook2.append(dfworkbook2,ignore_index=True)
dfworkbook3 = pd.read_excel(f, sheet_name="sheet3", usecols="B:AO")
dfworkbook3["Filename"] = "[" + os.path.basename(f) + "]"
all_workbook3 = all_workbook3.append(dfworkbook3,ignore_index=True)
When running this I can get the below error:
xlrd.biffh.XLRDError: No sheet named <'sheet3'>
I believe this is due to the fact that not all of my files have 'sheet3'. What is the best process to avoid this? I have tried to add code to the start that runs over the files and adds the missing sheets as a blank sheet but have been struggling with this.
Any help would be great. Thanks, Dan