
In SuperMongo, an old interpreted language that astrophysicists used for data processing a few years ago and that has since been superseded by Python, it was possible to do things like this (for clarity, I write pseudocode, not real SuperMongo):

mylist = [bar,foo]
for df in mylist:
    bar = pd.read_csv(df+".csv")
    for i = 1 to n:
        bar+"_"+i = bar * i

You get the idea: in a loop, you could create an object (for example, a Pandas DataFrame) whose name is built by concatenating other names, or build a string such as a file name the same way. I work with many DataFrames, and my code would be far more readable, less error-prone, and easier to write if I could loop over them. I sometimes use things like lists of lists:

my_list = [[df1, df1_something, df1_another_thing,"df1.csv"],
           [df2, df2_something, df2_another_thing,"df2.csv"],
           [dfothername, dfothername_something, dfothername_another_thing,"df1.csv"],
           ...
          ]
for df in my_list:
    df[2]=some_function(df[1])
    ...

But it's still not perfect.

Is there a way to achieve in Python what I've shown in the pseudocode?

Edit: The question is somewhat of a duplicate. In short, you can do something like this:

dic = {}
for name in namelist:
    dic[name]=pd.read_csv(name+".csv")
dic['othername']=pd.DataFrame(...)
for name in list(dic):  # iterate over a snapshot of the keys, since the loop adds new ones
    dic[name+"_processed"]=process(dic[name])

Or, for the last loop, depending on what is most convenient in context:

dic_processed = {}
for name in dic.keys(): 
    dic_processed[name]=process(dic[name])
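
A minimal self-contained sketch of the same idea, assuming in-memory CSV text instead of files on disk. The names `csv_text`, `process`, and `scaled` are hypothetical stand-ins, not part of the original code:

```python
import io
import pandas as pd

# Hypothetical in-memory CSV sources standing in for "bar.csv" and "foo.csv".
csv_text = {
    "bar": "x,y\n1,2\n3,4\n",
    "foo": "x,y\n5,6\n7,8\n",
}

def process(df):
    # Placeholder processing step (an assumption): double every value.
    return df * 2

# One dictionary holds the raw DataFrames, keyed by name.
dic = {name: pd.read_csv(io.StringIO(text)) for name, text in csv_text.items()}

# A second dictionary holds the processed results under the same keys,
# so no "_processed" suffix has to be concatenated onto the key.
dic_processed = {name: process(df) for name, df in dic.items()}

# Scaled copies, mirroring the bar_1, bar_2, ... idea from the pseudocode:
# the tuple (name, i) is the key instead of a dynamically built variable name.
n = 3
scaled = {(name, i): df * i for name, df in dic.items() for i in range(1, n + 1)}
```

With this layout, `dic_processed["bar"]` and `scaled[("foo", 3)]` play the role that `bar_processed` and `foo_3` would have played as dynamic variables.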
Matt
  • Eww, dynamic variables are rarely a good idea. Just use a dictionary, or a list. – Carcigenicate Dec 06 '17 at 13:14
  • Possible duplicate of [How can you dynamically create variables via a while loop?](https://stackoverflow.com/questions/5036700/how-can-you-dynamically-create-variables-via-a-while-loop) – Carcigenicate Dec 06 '17 at 13:15
  • I've read the question, but still don't understand how you build a DataFrame named name(df) + "_bar" from df in a loop – Matt Dec 06 '17 at 13:24
  • You'd create a dictionary, and do `dic[name(df) + "_bar"] = df`. – Carcigenicate Dec 06 '17 at 13:38
  • OK, I'm gonna try that, thanks. – Matt Dec 06 '17 at 13:57
  • Carcigenicate: This doesn't work, please have a look at my edit. – Matt Dec 07 '17 at 13:09
  • Why are you trying to use a data frame as a key? And the first way you wrote in the edit does work. Do you understand how to use dictionaries? – Carcigenicate Dec 07 '17 at 13:11
  • OK, first things first: how do you perform `name(df)` in your example? – Matt Dec 07 '17 at 13:14
  • If you have the name of the data frame externally, you do `dic[df_name]`. No, you can't query the name of the data frame every time, since it's a value and requires a key to access it. This would be the same limitation that you'd run into with SuperMongo-style dynamic variables, since you wouldn't be able to get the name without already having the name. The problem isn't the dictionary solution, it's your reliance on `name(df)` as the key to the dictionary. Just make the key based on something other than the data frame. – Carcigenicate Dec 07 '17 at 13:18
  • You're right, I guess there are lots of things I don't understand, otherwise I wouldn't be asking questions here. Could you just put an answer on how to do `for df in mylist: df = pd.read_csv(df+".csv") ; df+"_square"=df**2`? I'll be more than happy to approve it. – Matt Dec 07 '17 at 13:33
  • I'm just starting work now, so I can't write out a full answer. A few things I'll say though: instead of using the underscore naming ("df_square"), I'd just call the dictionary itself "squared_df" or something. Then you'd do `squared_df[your_key]`. Giving every key a "_square" suffix just makes it harder to use, since then you have to concatenate the key together every time you want to access a value. That convention may make sense in SuperMongo, but that's likely because their dynamic variables create globals, and having an underscore suffix prevents name collisions. – Carcigenicate Dec 07 '17 at 13:42
  • If you tuck everything into a dictionary however, name collisions aren't a problem (which is why they should be preferred over using dynamic variables). For your `df**2` example, you're still trying to use the `df` as part of the key. If `df**2` doesn't modify `df`, and you have continued access to `df`, then you can just do `squared[name(df)] = df**2`. Using the `df` name is only a problem if you're naming the key based on the value directly. Are you sure you need to do this approach at all though? – Carcigenicate Dec 07 '17 at 13:43
  • Do you need to have a mapping between `df` and `df**2`? Is it not enough just to stick all the `df**2`s in a list? – Carcigenicate Dec 07 '17 at 13:48
  • No, I don't need `df**2`, it was just the simplest example I could find. Let's say, for instance, that for each df in a list, I want to create `df_foo = pd.read_csv(name(df)+"_foo.csv")`, then process it and merge with `df = merge(df,df_foo,on='some_field')`. How do you do that? – Matt Dec 07 '17 at 15:59
  • OK mate, I've understood, it works perfectly, cheers! In my bad example, it would be written `for name in list(dic): dic[name+"_square"]=dic[name]**2`. Thank you very much!!! :) – Matt Dec 07 '17 at 19:45
  • Good, glad you got it working. Sorry, I spent my break helping someone else, so I haven't had a chance to check back. If you got dictionaries working as a solution, you should accept the duplicate to help people in the future find that post. – Carcigenicate Dec 07 '17 at 21:46
  • Note, I still recommend not adding the "_square" suffix. It's not needed and just bloats the key name, and the assignments that they're used in. If you call the dictionary `squared`, that information is implied. – Carcigenicate Dec 07 '17 at 22:21
  • I don't know how to accept the duplicate once it has been refused... – Matt Dec 08 '17 at 20:11
  • Mm, I can't re-suggest it. Oh well. It's a common enough question that people should still be able to find it. I just thought that tying a SuperMongo post to it would be beneficial. – Carcigenicate Dec 08 '17 at 21:07
  • You might be able to request it. I can't remember how much rep that takes. – Carcigenicate Dec 08 '17 at 21:07
  • OK, I've found how. I've also edited to give a short answer for anyone who might come from SuperMongo (although it's rather unlikely! :) ). Cheers, mate. – Matt Dec 10 '17 at 11:34
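
The read-then-merge case raised in the comments can be sketched with the same dictionary pattern. This is only an illustration under stated assumptions: the CSV contents are made up and held in memory rather than read from `name.csv` and `name_foo.csv`, and `main_csv`, `extra_csv`, and `merged` are hypothetical names.

```python
import io
import pandas as pd

# Hypothetical stand-ins for "<name>.csv" and "<name>_foo.csv".
main_csv = {
    "bar": "some_field,a\n1,10\n2,20\n",
    "foo": "some_field,a\n1,30\n2,40\n",
}
extra_csv = {
    "bar": "some_field,b\n1,100\n2,200\n",
    "foo": "some_field,b\n1,300\n2,400\n",
}

merged = {}
for name in main_csv:
    # With real files this would be pd.read_csv(name + ".csv").
    df = pd.read_csv(io.StringIO(main_csv[name]))
    # With real files this would be pd.read_csv(name + "_foo.csv").
    df_foo = pd.read_csv(io.StringIO(extra_csv[name]))
    # Join the two frames on the shared column and store the result by name.
    merged[name] = pd.merge(df, df_foo, on="some_field")
```

The loop variable is the name (a string), so both the file names and the dictionary key can be derived from it without ever asking a DataFrame for its own name.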

0 Answers