0

I have an excel workbook with 3 worksheets, they are called "Z_scores", "Alpha" and "Rho" respectively.

In the future, this workbook will increase as the number of models and their corresponding parameters are stored here.

In my function I am looking to import each worksheet individually and save it to a dataframe, the name of the dataframe should be decided by the name of the worksheet.

So far, I have this function but I am not able to dynamically name the dataframe and I am unsure what should be written in the return statement

FYI: The import identifier function is simply a way of scanning in worksheet names and those with the identifier present should not be inserted e.g. putting a single blankspace at the beginning of the worksheet name will prevent the worksheet being imported.

#import libraries
import pandas as pd

#define function
def import_excel(filename, import_identifier):
    #Create dataframe of the excel
    df = pd.read_excel('Excel.xlsx')
    # this will read the first sheet into df
    xls = pd.ExcelFile('Excel.xlsx')
    #Delete all worksheet that begin with the import_identifier    
    worksheets = []
    for x in all_worksheets:
        if x[0] != import_identifier:
            worksheets.append(x)
    #Loop through the sheets which are flagged for importing and import each
    #sheet individually into a dataframe
    for sheetname in worksheets:
        #Encase the sheetname in quotation marks to satisfy the sheetname function in read_excel
        sheetname_macro_str = '"{}"'.format(sheetname_macro)
        #Import the workbook and save to dynamically named dataframe
        sheetname_macro = pd.read_excel(xls, sheetname=sheetname_macro_str)

    #What would I return here, how do I ensure the data frames are stored?
    #return
78282219
  • 159
  • 1
  • 12
  • Possible duplicate of [Using Pandas to pd.read\_excel() for multiple worksheets of the same workbook](https://stackoverflow.com/questions/26521266/using-pandas-to-pd-read-excel-for-multiple-worksheets-of-the-same-workbook) – Erfan Nov 13 '19 at 16:33
  • See second answer of the link above. – Erfan Nov 13 '19 at 16:34

1 Answers1

0

As you can read in this thread, a DataFrame object can't reliably be "named". Usually, the Python variable to which the object is assigned will be what describes or differentiates it.

If you're looking to store references to multiple DataFrames in your code, you'll probably want to create a list, tuple, or dictionary for that (outside the scope of your import function). If you use a dictionary, then you can use your worksheet names as keys:

dataframes = {}
dataframes[friendly_sheetname] = dataframe_from_sheet
Evan
  • 231
  • 2
  • 8