
I have the following code for creating a dictionary of data frames using csv files:

import pandas as pd

l = ['employees','positions']
d = {}
for x in l:
    d[x] = pd.read_csv("P:\\python_work\\data_sets\\" + x + ".csv")

How would I do the same using a list of data frames that already exist in memory?

This doesn't work but maybe it helps clarify what I'm trying to do:

l = ['df1','df2']
d = {}
for x in l:
    d[x] = x

I would then be able to access individual data frames like so:

d['df1']

I provided the CSV example because it works and produces the same end result (a dictionary of data frames).

Here's an example of the desired contents of the dictionary:

{'employees':    id   name      date
 0   1    bob  1/1/2018
 1   2  sally  1/2/2018, 'positions':      pos      desc status
 0  12454  director      a
 1  65444   manager      i}

I want to use a list of existing data frames rather than CSV files. I tried using a list without quotes:

l = [employees, positions]
d = {}
for x in l:
    d[x] = x

...but I get this error:

TypeError: 'DataFrame' objects are mutable, thus they cannot be hashed
Dread

3 Answers


The problem is you're defining a list of strings and building a dictionary mapping each string to itself. Much simpler is to use enumerate with an iterable of dataframes. Assuming df1 and df2 are dataframes:

d = dict(enumerate((df1, df2), 1))

Then access your dataframes via d[1] and d[2]. If you really want your keys to be strings "df1" and "df2", you can use a dictionary comprehension:

d = {'df'+str(i): j for i, j in enumerate((df1, df2), 1)}

A better naming convention, in my opinion, is to use your filenames as keys:

files = ['employees', 'positions']
d = {f: pd.read_csv(f'P:\\python_work\\data_sets\\{f}.csv') for f in files}
jpp
  • I want to use the option similar to where you use filenames as keys, but I want to use the data frame names as keys (i.e. employees and positions are existing data frames rather than csv files). – Dread Jul 03 '18 at 15:05
  • @Dread, So the example in your question isn't really accurate? Here are the things you should **not** do: use `eval`, use `globals`, use `locals`. What you *can* do is read dataframes straight into your dictionary, e.g. `d = {}; d['employees'] = pd.read_csv(...)`. – jpp Jul 03 '18 at 15:06
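Building on the comment above, a minimal sketch that maps the existing DataFrames to string keys directly (assuming employees and positions are DataFrames already defined in memory):

# Map each name explicitly to the existing DataFrame object
d = {'employees': employees, 'positions': positions}

d['employees']  # returns the employees DataFrame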

You are almost there. I added k to show how you can use enumerate in this case:

l = ['employees','positions']
k = [1,2]
d = {}
for index,x in enumerate(l):
    d[x] = k[index]

This returns the following for d:

{'employees': 1, 'positions': 2}

Then access your dataframe with:

df_1 = d.get('employees')

(Of course, you would replace k[index] with the creation of your dataframe, as sketched below.)
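For example, a sketch of the same pattern with real DataFrames in place of the placeholder integers (assuming employees and positions already exist in memory):

names = ['employees', 'positions']
frames = [employees, positions]  # existing DataFrames instead of the placeholder list k
d = {}
for index, name in enumerate(names):
    d[name] = frames[index]

d.get('employees')  # returns the employees DataFrame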

XanderMJ

There is already a dictionary of all declared variables in memory, available via the locals() or globals() built-in functions, depending on whether the dataframes are defined as local or global variables. You should be able to access your DataFrame like this:

locals()['df1']
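A minimal sketch of this approach, assuming employees and positions are defined as global (module-level) variables:

# Look up each existing DataFrame by name in globals()
names = ['employees', 'positions']
d = {name: globals()[name] for name in names}

d['employees']  # returns the employees DataFrame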
Rob
  • In my opinion, using `globals()` for this purpose is not recommended, see https://stackoverflow.com/questions/1373164/how-do-i-create-a-variable-number-of-variables – jpp Jul 03 '18 at 14:55