
I'm importing multiple dataframes with the following process: (1) a list of files to be converted to dataframes, (2) a list of the names I want for the corresponding dataframes, and (3) a dictionary that combines the two lists:

tbls = ['tbl1', 'tbl2', 'tbl3']
dbname = ['dfABC', 'dfrand', 'dfXYZ']
dictdf = dict(zip(tbls, dbname)) 

Then I loop through tbls to import the dataframes. (getdf below is a short function I wrote that reads the path, sheet name, etc. for the Excel/CSV file in which the table data sits and imports the data.)

for tbl in tbls:
    dictdf[tbl] = getdf(tbl, dfRT, sfsession)

The process works, except that the dataframes are written into the dictionary as values, i.e. the string 'dfABC' in the dictionary is replaced with a dataframe of 65K rows and 27 columns, and so on.

What I want is a variable named dfABC that holds the dataframe of 65K rows and 27 columns. In the loop above, I tried:

str(dictdf[tbl]) = getdf(tbl, dfRT, sfsession)

but that gave an error. Is there a way to do this? Thanks.

SModi
  • Why do you not want your DataFrames stored inside a dict? It is the far better choice. I do not know why you would want to assign individual variables when the DataFrames are already tied to a key in the dict. – It_is_Chris Apr 17 '20 at 20:39
  • Why is storing inside a dictionary a better choice? I want to work on these dataframes, i.e. merge, splice, add new cols etc. How do I do this if they sit inside a dictionary? Thanks – SModi Apr 17 '20 at 20:44
  • Just call the DataFrames from the dict: `dictdf['tbl1']` – It_is_Chris Apr 17 '20 at 20:47
  • and why is storing inside a dictionary a better choice? – SModi Apr 17 '20 at 20:48
  • In order to accomplish what you are trying to do you would have to implicitly create global variables...which is a bad idea. – It_is_Chris Apr 17 '20 at 20:51
  • Using a dictionary is better than dynamically creating variables because it will result in less complex code. As your amount of code (and your total amount of variables) grows, it would become increasingly difficult to keep track of some specific dynamically created variables. Keeping track of a single dictionary variable on the other hand is much easier. – Xukrao Apr 17 '20 at 21:12
  • So I gave up on using exec as shown below because it doesn't work within functions (I couldn't figure out how to solve it with '.. in globals, locals'). I am rewriting my code to use the dataframes within the dictionary. But how do I make sure the dataframes do not overwrite the original dictionary values, but are added to it as a third part? – SModi Apr 28 '20 at 21:16
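
The dictionary approach suggested in the comments, including keeping the table name alongside the data rather than overwriting it, could be sketched roughly as follows. This is only an illustration under assumptions: the `getdf` here is a hypothetical stand-in that returns a plain list of rows so the snippet runs without pandas or the real files, and the nested `"table"`/`"data"` keys are names made up for this sketch.

```python
# Hypothetical stand-in for the asker's getdf(); the real function reads an
# Excel/CSV file and returns a DataFrame. Here it returns a list of row dicts.
def getdf(tbl, dfRT=None, sfsession=None):
    return [{"source": tbl, "row": i} for i in range(3)]

tbls = ['tbl1', 'tbl2', 'tbl3']
dbname = ['dfABC', 'dfrand', 'dfXYZ']

# Key each entry by the desired dataframe name and keep the source table
# name next to the loaded data, so neither piece of information is lost.
dictdf = {
    name: {"table": tbl, "data": getdf(tbl)}
    for name, tbl in zip(dbname, tbls)
}

# Work on a single entry by looking it up under its name.
abc = dictdf['dfABC']['data']
print(dictdf['dfABC']['table'], len(abc))
```

With real DataFrames, `dictdf['dfABC']['data']` would be used wherever a standalone `dfABC` variable was intended, for example as an argument to a merge.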

1 Answer


Solved using exec and flipping the dictionary (the flip isn't needed for the solution):

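tbls = ['tbl1', 'tbl2', 'tbl3']
dfs = ['dfABC', 'dfrand', 'dfXYZ']
dictdf = dict(zip(dfs, tbls))  # flipped: dataframe name -> table name
for df in dfs:
    tbl = dictdf[df]
    # creates a variable named after df (at module level, not inside a function)
    exec(f"{df} = getdf('{tbl}', dfRT, sfsession)")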

Please note @Xukrao and @Yo_Chris's comments on keeping the dfs within the dictionary as a superior solution.

I found this question useful for understanding how exec works: What's the difference between eval, exec, and compile?

SModi
  • I've removed my solution as the accepted answer because exec stops working when it's within a function. Adding globals and locals does not work. – SModi Apr 28 '20 at 20:41
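
As far as I understand, the exec approach fails inside a function because CPython fixes a function's local names when the function is compiled, so names assigned by exec do not become usable local variables. One possible way around this, sketched below, is to return a dictionary from a loader and unpack it into ordinary names where needed. The `getdf` here is a hypothetical stand-in that returns a list of rows, and `load_tables` and `process` are names invented for this sketch.

```python
# Hypothetical stand-in for the asker's getdf(); returns a list of row dicts
# instead of a DataFrame so the sketch runs without any files.
def getdf(tbl, dfRT=None, sfsession=None):
    return [{"source": tbl, "row": i} for i in range(3)]

def load_tables(tbls):
    # Build the mapping table name -> loaded data without exec.
    return {tbl: getdf(tbl) for tbl in tbls}

def process():
    tbls = ['tbl1', 'tbl2', 'tbl3']
    frames = load_tables(tbls)
    # Explicit unpacking gives normal local variables inside the function.
    dfABC, dfrand, dfXYZ = (frames[t] for t in tbls)
    return len(dfABC), dfrand[0]["source"], dfXYZ[-1]["row"]

result = process()
print(result)
```

The names on the left-hand side of the unpacking are written out by hand, which keeps them visible to readers and tools, at the cost of having to update that line when the list of tables changes.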