0

I'm working on an assignment and I'd like to optimise my code a little. Basically what I'm doing is reading .xlsx files which have three sheets each. For context - One is "Insurance info", second is "New insurances", third is "resignations". The code I currently have to load the file is following:

###Info
InsuranceInfo1 = pd.read_excel("First.xlsx", sheet_name='Info', header=None).dropna(axis = 0, how='all')
InsuranceInfo2 = pd.read_excel("Second.xlsx", sheet_name='Info', header=None).dropna(axis = 0, how='all')
InsuranceInfo3 = pd.read_excel("Third.xlsx", sheet_name='Info', header=None).dropna(axis = 0, how='all')

###New Insurances
New_Insurances1 = pd.read_excel("First.xlsx", sheet_name='New', header=None)
New_Insurances2 = pd.read_excel("Second.xlsx", sheet_name='New', header=None)
New_Insurances3 = pd.read_excel("Third.xlsx", sheet_name='New', header=None)

###Resignations
Resignations1 = pd.read_excel("First.xlsx", sheet_name='Resignations', header=None)
Resignations2 = pd.read_excel("Second.xlsx", sheet_name='Resignations', header=None)
Resignations3 = pd.read_excel("Third.xlsx", sheet_name='Resignations', header=None)

Now, with three files this would be somewhat acceptable and easy to go through. The problem is however, that I would like to read in 5,6 or more files. In order to do that, I wanted to write a function that would create the variables InsuranceInfo1, NewInsurances1, Resignations1 at once by just writing the file name.

def readfile(filename):

InsuranceInfo1 = pd.read_excel(filename, sheet_name='Info', header=None).dropna(axis = 0, how='all')
New_Insurances1 = pd.read_excel(filename, sheet_name='New', header=None)
Resignations1 = pd.read_excel(filename, sheet_name='Resignations', header=None)

    return(InsuranceInfo1, New_Insurances1, Resignations1)

This code works, but obviously returns just the hardcoded variables. What I'd like to have is something like this, where I can input a number so it'd create the variables with said number in name. My friend told me about a function in C called sprintf that helped him with similiar task, but neither of us had an idea how to format the code in Python

Thanks in advance for all suggestions!

def readfile(filename,i):

InsuranceInfo(i) = pd.read_excel(filename, sheet_name='Info', header=None).dropna(axis = 0, how='all')
New_Insurances(i) = pd.read_excel(filename, sheet_name='New', header=None)
Resignations(i) = pd.read_excel(filename, sheet_name='Resignations', header=None)

    return(InsuranceInfo(i), New_Insurances(i), Resignations(i))
Qbk
  • 23
  • 5
  • 2
    Whenever you find yourself naming variables with numeric suffixes, you should generally just put them in a list. Or since you're using pandas, make it another df. – Barmar Jun 18 '20 at 22:38
  • 3
    You don't want to create multiple named variables, you instead want to use a list or dictionary. – Samwise Jun 18 '20 at 22:38
  • 1
    I agree with @Samwise . My answer to this question: https://stackoverflow.com/questions/62290107/passing-float-to-a-nested-for-loop-and-storing-output/62290221#62290221 is another good explanation. – David Erickson Jun 18 '20 at 22:45

0 Answers0