
The code below generates the desired output in ONE DataFrame. However, I would like to dynamically create DataFrames in a for loop and then assign the shifted values to each one. For example, DataFrame df_lags_12 would contain only column1_t12 and column2_t12. Any ideas would be greatly appreciated. I attempted to dynamically create 12 DataFrames using exec, but from what I have found by searching, this is considered poor practice.

import pandas as pd
d={'column1':list(range(0,20)),
   'column2':list(range(19,-1,-1))}
df=pd.DataFrame(d)
df_lags=pd.DataFrame()
for col in df.columns:
    for i in range(12,0,-1):
        df_lags[col+'_t'+str(i)]=df[col].shift(i)
    df_lags[col]=df[col].values  
print(df_lags)
for i in range(12,0,-1):
    exec('model_data_lag_'+str(i)+'=pd.DataFrame()')

Desired output for the dynamically created DataFrame df_lags_12:

var_list=['column1_t12','column2_t12']
df_lags_12=df_lags[var_list]  
print(df_lags_12)
Kyle

2 Answers


I think the best approach is to create a dictionary of DataFrames:

d = {}
for i in range(12,0,-1):
    d['t' + str(i)] = df.shift(i).add_suffix('_t' + str(i))

If you need to specify the columns first:

d = {}
cols = ['column1','column2']
for i in range(12,0,-1):
    d['t' + str(i)] = df[cols].shift(i).add_suffix('_t' + str(i))

A dict comprehension solution:

d = {'t' + str(i): df.shift(i).add_suffix('_t' + str(i)) for i in range(12,0,-1)}

print (d['t10'])
    column1_t10  column2_t10
0           NaN          NaN
1           NaN          NaN
2           NaN          NaN
3           NaN          NaN
4           NaN          NaN
5           NaN          NaN
6           NaN          NaN
7           NaN          NaN
8           NaN          NaN
9           NaN          NaN
10          0.0         19.0
11          1.0         18.0
12          2.0         17.0
13          3.0         16.0
14          4.0         15.0
15          5.0         14.0
16          6.0         13.0
17          7.0         12.0
18          8.0         11.0
19          9.0         10.0
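
As a minimal sketch of how the dictionary maps onto the question's desired output: the frame the question calls df_lags_12 is simply the entry stored under 't12'. The name `lags` is chosen here for illustration and is not from the original post.

```python
import pandas as pd

# Same sample data as in the question.
df = pd.DataFrame({"column1": list(range(20)),
                   "column2": list(range(19, -1, -1))})

# One DataFrame per lag, keyed by a lag label ('t12', 't11', ..., 't1').
lags = {f"t{i}": df.shift(i).add_suffix(f"_t{i}") for i in range(12, 0, -1)}

# The frame the question calls df_lags_12 is a plain dictionary lookup.
df_lags_12 = lags["t12"]
print(list(df_lags_12.columns))  # ['column1_t12', 'column2_t12']

# Because the frames live in one container, looping over them is direct.
for key, frame in lags.items():
    print(key, frame.shape)
```

Each entry has the same index as `df`, so the frames can be compared or joined row by row.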

EDIT: It is possible with globals(), but a dictionary is much better:

cols = ['column1','column2']
for i in range(12,0,-1):
    globals()['df' + str(i)] =  df[cols].shift(i).add_suffix('_t' + str(i))

print (df10)
    column1_t10  column2_t10
0           NaN          NaN
1           NaN          NaN
2           NaN          NaN
3           NaN          NaN
4           NaN          NaN
5           NaN          NaN
6           NaN          NaN
7           NaN          NaN
8           NaN          NaN
9           NaN          NaN
10          0.0         19.0
11          1.0         18.0
12          2.0         17.0
13          3.0         16.0
14          4.0         15.0
15          5.0         14.0
16          6.0         13.0
17          7.0         12.0
18          8.0         11.0
19          9.0         10.0
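
One practical reason the dictionary tends to be easier to work with than separate global names is that the frames can be recombined without knowing their names in advance. The sketch below assumes the same sample data as the question; `lags`, `wide`, and `complete` are illustrative names.

```python
import pandas as pd

df = pd.DataFrame({"column1": list(range(20)),
                   "column2": list(range(19, -1, -1))})
lags = {f"t{i}": df.shift(i).add_suffix(f"_t{i}") for i in range(12, 0, -1)}

# Join the original columns and every lagged frame side by side.
wide = pd.concat([df] + list(lags.values()), axis=1)

# Keep only the rows where every lag is available.
complete = wide.dropna()
print(wide.shape, complete.shape)
```

With separate variables such as df1 ... df12, the same step would need every name written out by hand or looked up through globals().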
jezrael
  • This looks promising. Would it be possible to dynamically assign these to DataFrames 1-12, though? – Kyle Nov 04 '17 at 11:17
  • Do you mean `df1`, `df10`? It is possible, but bad practice. – jezrael Nov 04 '17 at 11:18
  • Okay, now I see what you are doing; I didn't realize a dictionary could hold DataFrames. So, I implemented your recommended version via: `d_df={}` `for col in model_data.drop(['usrec'],axis=1).columns:` `for i in range(12,0,-1):` `d_df['t'+str(i)]=model_data[col].shift(i).add_suffix('_t'+str(i)).to_frame()` However, it only preserves the last column of the 2300 or so columns in the original DataFrame. I know of the append function, but I am unsure whether it would work here. Any ideas? – Kyle Nov 04 '17 at 12:18
  • Disregard the above, I get the idea, you saved the columns to a list and shifted all of them. Very impressive solution. Many thanks!!! – Kyle Nov 04 '17 at 12:24
  • Hi jezrael, do you think you could assist with this question? If this is considered "taboo", sorry, I won't do that again. – Kyle Nov 09 '17 at 13:21
  • No, it is an ideal way to ping someone ;) Give me some time, I can check it. – jezrael Nov 09 '17 at 13:23
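
The "only the last column is preserved" behaviour described in the comments can be reproduced on a small scale. In the sketch below, the tiny `model_data` frame and its columns `a` and `b` are hypothetical stand-ins for the roughly 2300 real columns; only `usrec` comes from the comment.

```python
import pandas as pd

# Hypothetical miniature of the commenter's data.
model_data = pd.DataFrame({"a": [1, 2, 3, 4],
                           "b": [10, 20, 30, 40],
                           "usrec": [0, 0, 1, 1]})
features = model_data.drop(columns=["usrec"])

# Looping over columns writes to the same key once per column,
# so each key ends up holding only the last column processed.
overwritten = {}
for col in features.columns:
    for i in range(2, 0, -1):
        overwritten[f"t{i}"] = features[col].shift(i).to_frame(f"{col}_t{i}")

# Shifting the whole frame at once keeps every column under each key.
per_lag = {f"t{i}": features.shift(i).add_suffix(f"_t{i}")
           for i in range(2, 0, -1)}

print(list(overwritten["t2"].columns))
print(list(per_lag["t2"].columns))
```

The first dictionary holds a single column per lag and the second holds all of them, which matches the resolution the commenter reached.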
2
import pandas as pd

for i in range(1, 16):
    text=f"Version{i}=pd.DataFrame()"
    exec(text)

A combination of exec and an f-string will do that. If you need to iterate, or need numbered versions of the same variable, the statement above will help.
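
If exec is used this way, one option is to pass it an explicit dictionary as the namespace, so the generated names stay out of the module's globals and remain easy to enumerate. This is a sketch under that assumption; the name `ns` is illustrative.

```python
import pandas as pd

# Namespace that exec will read from and write to; pd must be visible inside it.
ns = {"pd": pd}

for i in range(1, 16):
    exec(f"Version{i} = pd.DataFrame()", ns)

# The generated objects are then ordinary dictionary entries.
version_names = [name for name in ns if name.startswith("Version")]
print(len(version_names))
print(type(ns["Version3"]).__name__)
```

At this point the namespace is doing the job of the dictionary recommended in the other answer, which is why building the dictionary directly is usually simpler.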

oguz ismail