I have an Excel (.xlsx) file with 40 or so sheets. Each sheet has the same structure, but contains different data. I want to extract information from each sheet and consolidate it into one, with the information from each sheet stacked one over the other. The information I need to extract from each are two:
- Sheet name, found always in cell E3
- Cell area of interest, found always between rows 72-85 and columns E-V
This extracted information would be pasted in rows 2-15 of the consolidated sheet with the sheet name in one column and all other information in the columns next to it. Then, information from the next sheet extracted would be pasted in rows 16-29 and so forth.
I’m new to Python and I am running Python 3.6.1. I have found solutions on how to combine Excel files using Python, but nothing for extracting pieces of information from specified cells in an Excel sheets.
Any guidance would be very helpful.
Update 1: I managed to load into a data frame the area of one of my sheets using.
import pandas as pd
cis = pd.ExcelFile(r"C:\mypath\myfile.xlsx")
df1 = cis.parse('1. Consumers', skiprows=[0,1,2,3,4,5,6],parse_cols=[1,2,3,4,5,6,7,8,9,10,11,12,13,14,15]).drop([13])
However, I still have to add the sheet name to the data frame and then run this same loop across all of the sheets to then finally append the information from each sheet together in one dataframe/file.