-1

My searching was unable to find a solution for this one. I hope it is simple and just missed it.

I am trying to assign a dataframe variable based on a dictionary key. I want to loop through a dictionary of keys 0, 1, 2 3... and save the dataframe as df_0, df_1, df_2 ... I am able to get the key and values working and can assign one dataframe, but cannot find a way to assign new dataframes based on the keys.

I tried How to create a new dataframe with every iteration of for loop in Python but it didn't seem to work.

Here is what I tried:

docs_dict = {0: '2635_base', 1: '2635_tri'}

for keys, docs in docs_dict.items():
    print(keys, docs)
    df = pd.read_excel(Path(folder_loc[docs]) / file_name[docs], sheet_name=sheet_name[docs], skiprows=3)}

Output: 0 2635_base 1 2635_tri from the print statement, and %whos DataFrame > df as excepted.

What I would like to get is: df_0 and df_1 based on the excel files in other dictionaries which work fine.

    df[keys] = pd.read_excel(Path(folder_loc[docs]) / file_name[docs], sheet_name=sheet_name[docs], skiprows=3)

produces a ValueError: Wrong number of items passed 26, placement implies 1

SOLVED thanks to RubenB for pointing me to How do I create a variable number of variables? and answer by @rocky-li using globals()

for keys, docs in docs_dict.items():
    print(keys, docs)
    globals()['df_{}'.format(keys)] = pd.read_excel(...)}

>> Output: dataframes df_0, df_1, ...
  • why not use `dfs = {}` and then `dfs[key] = ...`? – hongsy Jan 28 '20 at 09:23
  • I placed `dfs = {}` before the loop and changed `df[keys]` to `dfs[keys]` and no error on run, but a dataframe isn't created. %whos DataFrame output is: No variables match your requested type. – Bradley Sawler Jan 28 '20 at 11:04

1 Answers1

0

You might want to try a dict comprehension as such (substitute pd.read_excel(...docs...) with whatever you need to read the dataframe from disc):

docs_dict = {0: '2635_base', 1: '2635_tri'}
dfs_dict = {k: pd.read_excel(...docs...) for k, docs in docs_dict.items()}
RubenB
  • 525
  • 3
  • 10
  • Thanks @RubenB I was able to get dict comprehension but I'm not sure how to get a dataframe df_0 = dfs_dict[0] and df_1 = dfs_dict[1] automatically without hardcoding it in. I want df_0 and df_1 and df_x depending on the case so that I can run merges and column checks. Basically I want a separate df for each excel file read in. – Bradley Sawler Jan 28 '20 at 11:18
  • You want to dynamically create variables? That is generally bad practice and not necessary for merging and column checks. Have a look at https://stackoverflow.com/questions/1373164/how-do-i-create-a-variable-number-of-variables as to why you should not do this. You would be better off looping over the iterable with dataframes and do whatever you want to them individually. Wrt the merging, check out https://stackoverflow.com/questions/38089010/merge-a-list-of-pandas-dataframes to see how you would do this with an iterable. – RubenB Jan 28 '20 at 22:13
  • Bingo. The answer was in the [1373164](https://stackoverflow.com/questions/1373164/how-do-i-create-a-variable-number-of-variables) by RockyLi using globals()['df_{}'.format(keys)] = ... . I am unable to upvote your answer RubenB as I don't have enough reputation. Hopefully, someone can on my behalf. – Bradley Sawler Jan 29 '20 at 20:46