I have an Excel workbook with 8 sheets. They all share the same column header structure. The only difference is that the first sheet starts at row 1, while the rest of the sheets start at row 4.
I tried the command below, but it gives me the wrong data. I recognize that this is because I passed sheet_name=None, so the same skiprows value is applied to every sheet even though the sheets start at different rows:
df = pd.concat(pd.read_excel(xlsfile, sheet_name=None, skiprows=4), sort=True)
My next attempt was:
frames = []
df = pd.read_excel(xlsfile, sheet_name='Questionnaire')
for sheet in TREND_SHEETS:
    tmp = pd.read_excel(xlsfile, sheet_name=sheet, skiprows=4)
    # append tmp dynamically to frames, then use concat frames at the end.. ugly
    df.append(tmp, sort=False)
return df
Note that Questionnaire is the first sheet in the Excel workbook. I know the logic here is off, and I would prefer not to create temporary variables holding each 'tmp', append them to a list, and then concatenate the frames.
How can I solve this so that I end up with a single DataFrame that incorporates the data from all the sheets?
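To make the goal concrete, here is a minimal sketch of the kind of result I am after, under some assumptions: read_all_sheets and its reader parameter are hypothetical names introduced only for illustration, the first sheet needs no rows skipped, and every other sheet uses the same skiprows value as in my attempts above. It builds the frames in a single comprehension and calls pd.concat once, which may or may not be the cleanest way to do this.

```python
import pandas as pd


def read_all_sheets(xlsfile, first_sheet, other_sheets, skiprows=4, reader=pd.read_excel):
    """Hypothetical helper: read each sheet with its own skiprows, then concat once."""
    # Map every sheet name to the number of leading rows to skip.
    # Assumption: the first sheet starts at the top, so it skips nothing.
    skip_by_sheet = {first_sheet: 0}
    skip_by_sheet.update({sheet: skiprows for sheet in other_sheets})

    # One read per sheet, no named temporaries.
    frames = [
        reader(xlsfile, sheet_name=sheet, skiprows=rows)
        for sheet, rows in skip_by_sheet.items()
    ]
    # A single concat at the end; ignore_index gives a clean 0..n-1 index.
    return pd.concat(frames, ignore_index=True, sort=False)
```

With the names from my attempt, the call would look like df = read_all_sheets(xlsfile, 'Questionnaire', TREND_SHEETS).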