Extracting dataframes from a dictionary (containing many dataframes)

Question

I have a directory containing many .dta (Stata format) files which I have loaded into a dictionary of dataframes.

Given I now may have many dataframes within the unified dictionary object, how can I select a few and "extract" them out of the dictionary?

For example, if the dictionary holds three dataframes, how would I extract just two of them, getting the same result as

dat10 = df_dict['dat10']
dat11 = df_dict['dat11']

but without using literal assignments. Maybe some sort of looping structure over the dictionary.
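
For illustration, a minimal sketch of the kind of loop described above. The toy dataframes stand in for the loaded .dta files, and the names `wanted` and `selected` are hypothetical, not part of the original code.

```python
import pandas as pd

# Toy stand-ins for the dataframes loaded from the .dta files (assumed data).
df_dict = {
    "dat9": pd.DataFrame({"old_var": [1.0, 2.0]}),
    "dat10": pd.DataFrame({"old_var": [4.0, 8.0]}),
    "dat11": pd.DataFrame({"old_var": [2.0, 4.0]}),
}

# Hypothetical list of the keys to pull out of the dictionary.
wanted = ["dat10", "dat11"]

# Build a smaller dictionary holding only the requested dataframes.
selected = {name: df_dict[name] for name in wanted}

# When the set of names is fixed and known, unpack into standalone variables.
dat10, dat11 = (df_dict[name] for name in wanted)
```

Both `selected["dat10"]` and `dat10` refer to the same object as `df_dict["dat10"]`, so no data is copied by either approach.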

Welcome to Stack Overflow. It's not entirely clear what your expected output is. Are you just asking how to iterate over a dictionary? — G. Anderson, Sep 13 '21 at 20:12
Does this answer your question? [Iterate through dictionary values?](https://stackoverflow.com/questions/30446449/iterate-through-dictionary-values) — G. Anderson, Sep 13 '21 at 20:12
Hi @G.Anderson thank you for responding. My expected output is standalone dataframes i.e., dat10, dat11,... so that I can use them later. So I already have these dataframes in the dictionary and I was wondering if I can extract them from the dictionary. For better context, these were all .dta files and I have imported them in Python using a dictionary. Let me know if that helps. Thanks again! — idg23, Sep 13 '21 at 20:23
I think I understand a little better now. As you can see in [How can I create variable variables](https://stackoverflow.com/questions/1373164/how-do-i-create-variable-variables) the generally accepted best practice is, in fact, to store them in a dictionary, since python doesn't innately allow you to create variables dynamically. What you have now is a perfectly good way of storing them, which can be accessed and operated on via key:value pairs. What is it you've tried and found that you _can't_ do with the structure the way it is now? — G. Anderson, Sep 13 '21 at 20:45
Thanks again! If I have access to the individual dataframes, then I can do some further data cleaning using a for loop. Let's assume (for the sake of simplicity) that I want to create a new variable in all of them and drop some other variables. If I have these dataframes, then I can simply run a for loop like: `for df in (df9, df10, df11): df['new_var'] = 1 / df['old_var'] + 1; df.drop(['va1', 'var2'], axis=1, inplace=True)` — idg23, Sep 13 '21 at 20:51
I would counter that by saying you could just as easily say `for df in (df9, df10, df11): df_dict[df['new_var']] = 1 / df_dict[df['old_var']] + 1 df_dict[df] = df_dict[df].drop(['va1', 'var2', axis = 1, inplace=True)` — G. Anderson, Sep 13 '21 at 21:01
Thanks again! Unfortunately if I do as you suggested, I get an error: TypeError: unhashable type: 'Series' — idg23, Sep 13 '21 at 21:28
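
The `TypeError` is probably raised because the loop variable `df` in the suggested snippet is a dataframe, so `df['old_var']` is a Series, and a Series cannot be used as a dictionary key. A minimal sketch of the same cleaning step that loops over the dictionary keys instead, assuming toy data and the column names mentioned in the comments:

```python
import pandas as pd

# Toy stand-ins for the loaded dataframes (assumed data and key names).
df_dict = {
    "dat10": pd.DataFrame({"old_var": [1.0, 2.0], "va1": [0, 0], "var2": [0, 0]}),
    "dat11": pd.DataFrame({"old_var": [4.0, 8.0], "va1": [1, 1], "var2": [1, 1]}),
}

# Iterate over key names (strings), not over the dataframes themselves.
for name in ("dat10", "dat11"):
    df = df_dict[name]
    # Add the derived column to the dataframe stored in the dictionary.
    df["new_var"] = 1 / df["old_var"] + 1
    # drop() returns a new dataframe, so store it back under the same key.
    df_dict[name] = df.drop(columns=["va1", "var2"])
```

Using `for name, df in df_dict.items()` would apply the same step to every dataframe in the dictionary rather than to a chosen subset.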

