I'm trying to load data into datasets in Python. My data is arranged by years. I want to assign variable names in a loop. Here's what it should look like in pseudocode:

import pandas as pd
for i in range(2010,2017):
    Data+i = pd.read_csv("Data_from_" +str(i) + ".csv")
    # Stores data from file "Data_from_YYYY.csv" as dataset DataYYYY.

The resulting datasets would be Data2010 through Data2016 (range(2010, 2017) stops before 2017).

0x51ba
Marcel B
  • You can't. Well, at least, you definitely shouldn't (using `exec` is dangerous). What's wrong with a dictionary `data = {}; for i in range(2010, 2017): data[i] = pd.read_csv("Data_from_" +str(i) + ".csv")`? – FHTMitchell Mar 07 '18 at 19:18
  • There are canonical ways which aren't risky the way that `exec` is, see my answer. – jooks Mar 07 '18 at 19:26
  • Thank you very much, using a dictionary works perfectly. – Marcel B Mar 07 '18 at 19:32

2 Answers

While this is possible, it is not a good idea. Code with dynamically-generated variable names is difficult to read and maintain. That said, you could do it using the exec function, which executes code from a string. This would allow you to construct your variable names dynamically using string concatenation.

However, you really should use a dictionary instead. This gives you an object with dynamically-named keys, which is much more suited to your purposes. For example:

import pandas as pd
Data = {}
for i in range(2010,2017):
    Data[i] = pd.read_csv("Data_from_" +str(i) + ".csv")
    # Stores data from file "Data_from_YYYY.csv" under the key YYYY.

# Access data like this:
Data[2011]

You should also use snake_case for variable names in Python, so Data should be data.
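
As a minimal sketch of the dictionary approach with a snake_case name: the example below writes a few placeholder CSV files into a temporary folder so that it runs on its own, and it uses a hypothetical read_rows helper built on the standard library's csv module as a stand-in for pd.read_csv. With pandas installed, calling pd.read_csv(path) in its place should give one DataFrame per year instead of a list of rows.

```python
import csv
import tempfile
from pathlib import Path


def read_rows(path):
    # Hypothetical stand-in for pd.read_csv: returns a list of row dicts.
    with open(path, newline="") as handle:
        return list(csv.DictReader(handle))


years = range(2010, 2017)  # 2010 through 2016; the end value is excluded

with tempfile.TemporaryDirectory() as folder:
    # Placeholder files for illustration only; real data would already exist.
    for year in years:
        Path(folder, "Data_from_{}.csv".format(year)).write_text(
            "year,value\n{},1\n".format(year)
        )

    # One dictionary entry per year, keyed by the integer year.
    data = {}
    for year in years:
        data[year] = read_rows(Path(folder, "Data_from_{}.csv".format(year)))

print(sorted(data))  # the years that were loaded
print(data[2011])    # the rows read from Data_from_2011.csv
```

Keying by the integer year means a later loop can reuse the same range to look up each dataset, with no string building involved.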

If you really wanted to dynamically generate variables, you could do it like this. (But you aren't going to, right?)

import pandas as pd
for i in range(2010,2017):
    exec("Data{} = pd.read_csv(\"Data_from_\" +str(i) + \".csv\")".format(i))
    # Stores data from file "Data_from_YYYY.csv" as dataset DataYYYY.

You can do this without exec too; have a look at jooks' answer.
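
To illustrate the readability cost of generated names, here is a small sketch. It makes two assumptions that are not in the original code: the values are placeholder integers rather than DataFrames, and exec writes into an explicit dictionary called namespace so that the example leaves the module's real globals alone.

```python
# Variant of the exec approach that writes into an explicit dictionary
# (an assumption made here so the sketch does not touch real module globals).
namespace = {}
for year in range(2010, 2013):
    exec("Data{} = {}".format(year, year * 10), namespace)

# Reading the values back means rebuilding each name as a string.
total_from_names = 0
for year in range(2010, 2013):
    total_from_names += namespace["Data{}".format(year)]

# The dictionary version stores and reads by key directly.
data = {}
for year in range(2010, 2013):
    data[year] = year * 10
total_from_dict = sum(data.values())

print(total_from_names == total_from_dict)
```

Both versions hold the same values, but the generated names have to be reassembled from strings on every access, which is the bookkeeping a dictionary already does for you.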

Aaron Christiansen
  • *Code with dynamically-generated variable names is difficult to read and maintain.* This can't be said loudly enough. – Drise Mar 07 '18 at 19:24
  • To wit: python uses `snake_case`, not `camelCase`. – jooks Mar 07 '18 at 19:25
  • @jooks Whoops, I'll change my answer. Been doing much more C# lately! – Aaron Christiansen Mar 07 '18 at 19:25
  • @AaronChristiansen [Relevant PEP 8 section regarding use of snake_case](https://stackoverflow.com/questions/159720/what-is-the-naming-convention-in-python-for-variable-and-function-names) – Drise Mar 07 '18 at 19:26
  • Haven't seen `exec()` before. Cool. Also, code with dynamically-generated variables sounds like a footgun. – Quentin Mar 07 '18 at 19:59

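You can use globals(), locals(), or an __init__ that uses setattr on self to assign the variables to an instance. Note that assigning through locals() is only reliable at module level, where it is the same dictionary as globals(); inside a function it does not create a local variable.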

In [1]: globals()['data' + 'a'] = 'n'

In [2]: print(dataa)
n

In [3]: locals()['data' + 'b'] = 'n'

In [4]: print(datab)
n

In [5]: class Data(object):
   ...:     def __init__(self, **kwargs):
   ...:         for k, v in kwargs.items():
   ...:             setattr(self, k, v)
   ...:

In [6]: my_data = Data(a=1, b=2)

In [7]: my_data.a
Out[7]: 1
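
The session above runs at the top level of an interactive shell. As a hedged illustration of how the locals() route behaves elsewhere, the sketch below repeats the same assignment inside a function (assign_via_locals is a hypothetical name chosen for this example):

```python
def assign_via_locals():
    # Writing to the dict returned by locals() inside a function
    # does not create a local variable named datab.
    locals()["datab"] = "n"
    try:
        return datab  # looked up as a global, which does not exist here
    except NameError:
        return "datab is not defined"


result = assign_via_locals()
print(result)
```

The function ends up in the NameError branch, so this route should not be relied on outside module-level code.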

I would probably go the third route. That said, you may be approaching the problem in an unconventional way: the pattern is possible, but it is not one most Python readers will recognize.
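
A rough sketch of how the third route might look for the per-year data in the question. The "y" prefix and the placeholder string values are assumptions made for illustration; real code would store the loaded datasets instead.

```python
class Data(object):
    def __init__(self, **kwargs):
        # Turn every keyword argument into an attribute on the instance.
        for k, v in kwargs.items():
            setattr(self, k, v)


# Attribute names cannot start with a digit, so a "y" prefix is assumed here.
per_year = {}
for year in range(2010, 2017):
    per_year["y{}".format(year)] = "rows for {}".format(year)  # placeholder value

my_data = Data(**per_year)
print(my_data.y2011)

# When the year is held in a variable, getattr is needed to build the name.
year = 2012
print(getattr(my_data, "y{}".format(year)))
```

Note that the attributes are built from a dictionary in the first place, and any access driven by a variable goes back through a string, so a plain dictionary keyed by year may still be the simpler choice.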

jooks