I need to repeat similar operations on data frames that have identical structures but refer to different years. The objective is to write a function (the one included here is only meant to illustrate the problem) that picks the data using one of its arguments (the year, in this case). I would like to select the data frame inside the function by its name, using the last part of the name (the year) as an argument of the function.
import pandas as pd
import numpy as np
Suppose you have three data frames, one for each year:
df_2005 = pd.DataFrame(np.random.randint(0,10,size=(10, 4)), columns=list('ABCD'))
df_2006 = pd.DataFrame(np.random.randint(0,10,size=(10, 4)), columns=list('ABCD'))
df_2007 = pd.DataFrame(np.random.randint(0,10,size=(10, 4)), columns=list('ABCD'))
This function extracts, as an example, some data from the data frame to generate a different variable:
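def func_1(year):
    # Intended: select the data frame for the given year (e.g. df_2005); this is the line that fails
    data = df_ + year
    X = data.iloc[[1, 2], [2, 3]].copy()
    return X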
This is the way I intend to use the function:
subdata_2005=func_1('2005')
I have tried several things, such as df_+year and df_`year', but nothing seems to work. I could not find answers to similar questions that would help in this case. Any suggestions would be highly appreciated.
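
To make the intended behaviour concrete, here is a minimal sketch of what I am after. It assumes the data frames can be stored in a dictionary keyed by the year string instead of in separately named variables. The dictionary `frames` is hypothetical and not part of my current code, so this may not be the only or the best way to do it.

```python
import numpy as np
import pandas as pd

# Hypothetical container (an assumption): one data frame per year, keyed by the year string
frames = {
    year: pd.DataFrame(np.random.randint(0, 10, size=(10, 4)), columns=list("ABCD"))
    for year in ("2005", "2006", "2007")
}


def func_1(year):
    """Return rows 1-2 and columns C-D of the data frame stored under `year`."""
    # Look the frame up by key rather than trying to build a variable name from a string
    data = frames[year]
    return data.iloc[[1, 2], [2, 3]].copy()


subdata_2005 = func_1("2005")
```

With this layout the call `func_1('2005')` works as I intended, and an unknown year raises a `KeyError` instead of silently picking the wrong data.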