
Here is a DataFrame, like this:

    df_12=df[df.index.year==2012]

I want to get a series of DataFrames df_X, for X = '13', '14', '15', like below:

    df_13,df_14...

In fact, I can get them directly by doing:

    df_13=df[df.index.year==2013]
    df_14=df[df.index.year==2014]
    df_15=df[df.index.year==2015]
    df_16=df[df.index.year==2016]

But this gets out of hand when X covers many values. So I tried to use a for loop:

    for x in ['13','14','15','16']:
        df_+x=df[df.index.year==int('20'+x)]

It raises an error, and I understand why this is wrong. Can anybody achieve this with a loop? Thanks!

xiao_dong
  • Are you sure it's `==` and not `=`? I'm not familiar with pandas dataframes. Also, I'm surprised that `df_+x=...` works, as that seems to be assigning to an expression rather than to a reference. Again, I'm not familiar with pandas. – TigerhawkT3 May 05 '16 at 07:33
  • Please post your error message, otherwise we can only guess ;) That said I am guessing that @TigerhawkT3 is right, but I am also not familiar with pandas. – m00am May 05 '16 at 07:36
  • Actually, going from `df_13`, `df_14`... to `df_+x` looks like an attempt at [variable variables](http://stackoverflow.com/questions/1373164/how-do-i-do-variable-variables-in-python). – TigerhawkT3 May 05 '16 at 07:37
  • You should read this [great post](http://nedbatchelder.com/blog/201112/keep_data_out_of_your_variable_names.html) by Ned Batchelder. The gist of it is that if writing them all out is going to get overly cumbersome when using a large `X`, then working with all those variables will also get really cumbersome. You should make a list of dataframes instead. – mgilson May 05 '16 at 07:53

2 Answers

1

I am not an expert, but I think you could use a dictionary and do something like this:

    x = range(10)

    # Build the key names 'df_0' ... 'df_9'
    name = ['df_{0}'.format(i) for i in x]

    year = range(2013, 2023)

    # Map each name to its year
    data = dict(zip(name, year))

Then you can look up your data by name:

    data['df_0']
    Out[45]: 2013

Actually, I don't know if you can use it with DataFrames...
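A dict can hold DataFrames just as well as integers. Here is a minimal sketch, assuming a small made-up frame with a DatetimeIndex (the column name `value` and the dates are hypothetical, not taken from the question):

```python
import pandas as pd

# Hypothetical sample data: one row per quarter from 2013 through 2016.
index = pd.date_range("2013-01-01", periods=16, freq="QS")
df = pd.DataFrame({"value": range(16)}, index=index)

# Store one sub-frame per year under a 'df_<yy>' key instead of
# creating a separate variable for each year.
data = {}
for year in range(2013, 2017):
    key = "df_{0}".format(year % 100)
    data[key] = df[df.index.year == year]

print(sorted(data))        # the keys that were created
print(data["df_13"])       # the rows that fall in 2013
```

Each value in `data` is an ordinary DataFrame, so `data['df_13']` can be used anywhere `df_13` would have been.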

0

Your problem is that you can't have a variable variable name: an expression such as `df_+x` cannot appear on the left-hand side of an assignment.

@mgilson is right, you should really create a list or dict of dataframes. If repeated references to `df[df.index.year==2012]` are too cumbersome, build a dict in a loop:

    df_ = {}
    for x in ['13','14','15','16']:
        df_[x]=df[df.index.year==int('20'+x)]

    # usage
    df_['13']

Integer keys would be less bother to type than string ones, i.e. use `for x in [13,14,15,16]` and `df_[13]`.
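With integer keys, the years can also be derived from the data instead of being listed by hand. This is only a sketch under assumptions: the sample frame is made up, the name `frames_by_year` is hypothetical, the keys are full years (2013) rather than 13, and `groupby` on `df.index.year` is swapped in for the explicit loop:

```python
import pandas as pd

# Hypothetical sample data: monthly rows covering 2013 through 2016.
index = pd.date_range("2013-01-01", periods=48, freq="MS")
df = pd.DataFrame({"value": range(48)}, index=index)

# One sub-frame per calendar year, keyed by the integer year.
frames_by_year = {
    int(year): group for year, group in df.groupby(df.index.year)
}

print(sorted(frames_by_year))      # the years found in the index
print(frames_by_year[2014])        # the rows that fall in 2014
```

This picks up every year present in the index, so nothing needs to change when the data grows to cover more years.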

If you are really absolutely determined to do something horrible, there is locals() which gives you access to the local namespace as a dict.

    locals()["df_"+x] = df[df.index.year==int('20'+x)]

My advice: just don't!
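One concrete reason to avoid it: in CPython, writing to `locals()` inside a function does not create a real local variable. A small sketch (the function name `try_locals` is made up for illustration):

```python
def try_locals():
    # Write into the dict returned by locals(), as in the snippet above.
    locals()["df_13"] = "pretend this is a DataFrame"
    try:
        # The compiler treats df_13 as a global name here, and no such
        # global exists, so the lookup fails.
        return df_13
    except NameError:
        return "NameError"

print(try_locals())
```

At module level `locals()` is the same dict as `globals()`, so the assignment does take effect there, which means the trick behaves differently depending on where the line runs.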

nigel222