I would like to create dataframes that have the same name as the variable name. I created the following function:

import glob
import pandas as pd

def path_to_df(path):
   filename=str(path).split('/')[-1]
   allFiles = glob.glob(path + "/*.csv.gz")
   list_=[]
   for file_ in allFiles:
       df = pd.read_csv(file_,index_col=None, header=0)
       list_.append(df)
   frame = pd.concat(list_, axis = 0, ignore_index = True)
   frame.columns = [str(filename)+ '_' +str(col) for col in frame.columns]
   exec('{}_df=frame'.format(filename))
   print('Completed:  {}_df'.format(filename))

Each step of the function works, except for the following step:

 exec('{}_df=frame'.format(filename))

There are no errors when I run the code. The function builds 'frame' correctly, but the custom dataframe (i.e. {}_df) does not exist after the function has run.

sos.cott

1 Answer

I'm not sure why you don't want to just use a dictionary for this, but to answer the question as asked: you can set these variables on the current module using sys and setattr, like so:

import sys

# Now instead of exec'ing
setattr(sys.modules[__name__], '{}_df'.format(filename), frame)
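
For context on why the exec version fails silently: inside a function, exec assigns into a temporary copy of the function's local namespace, so the new name never reaches the module. The following is a minimal sketch of that contrast, not the original function. The helper names with_exec and with_setattr, and the list standing in for a dataframe, are made up for illustration.

```python
import sys

module = sys.modules[__name__]  # the module this code is running in

def with_exec(name, value):
    # exec assigns into a snapshot of this function's locals,
    # so '<name>_df' is discarded when the function returns.
    exec('{}_df = value'.format(name))

def with_setattr(name, value):
    # setattr writes '<name>_df' straight into the module's namespace.
    setattr(module, '{}_df'.format(name), value)

with_exec('sales', [1, 2, 3])
after_exec = hasattr(module, 'sales_df')

with_setattr('sales', [1, 2, 3])
after_setattr = hasattr(module, 'sales_df')

print(after_exec, after_setattr)  # False True
```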

Again though, you almost certainly want to use a dict instead. See this question: "How can you dynamically create variables via a while loop?"
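
A rough sketch of what the dict version could look like, assuming the question's function is changed to return its frame instead of creating a variable. The names frame_key and load_all are hypothetical, and the loader argument is a stand-in for that changed function; a dummy loader is used here so the sketch runs without any data files.

```python
import os

def frame_key(path):
    # Build the '<folder>_df' label that the question wanted as a variable name.
    folder = os.path.basename(os.path.normpath(path))
    return '{}_df'.format(folder)

def load_all(paths, loader):
    # Map each '<folder>_df' label to whatever loader(path) returns.
    # loader is assumed to be something like path_to_df, changed to return frame.
    return {frame_key(p): loader(p) for p in paths}

# Dummy loader so the example is self-contained; swap in the real one.
frames = load_all(['data/sales', 'data/users/'],
                  loader=lambda p: 'frame for ' + p)

print(sorted(frames))      # ['sales_df', 'users_df']
print(frames['sales_df'])  # frame for data/sales
```

Each frame is then looked up by key, e.g. frames['sales_df'], and the full set of loaded frames can be iterated over, which is awkward with dynamically created variables.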

Nick Chapman
  • This works! Thanks. Will click on the green tick in 4 mins. :D Why would you recommend using a dictionary over your current answer? – sos.cott Dec 19 '18 at 14:44
  • Read through the link that I shared. This is exactly the situation that dictionaries are made for. Furthermore, the variables in a module are actually just stored in a dictionary within the module (this is a simplification, but good enough for now), so you're really just coming up with a more complicated way of using a dictionary that you didn't declare. – Nick Chapman Dec 19 '18 at 14:51
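
The point in the last comment can be checked directly. This small sketch (the name demo_df and its placeholder value are invented for the example) sets an attribute on the current module and then reads it back out of the module's underlying dictionary:

```python
import sys

module = sys.modules[__name__]

# Same technique as the answer: create a module-level name dynamically.
setattr(module, 'demo_df', 'placeholder')

# vars(module) returns the module's namespace, which is a plain dict.
namespace = vars(module)
is_plain_dict = type(namespace) is dict
stored_value = namespace['demo_df']

print(is_plain_dict, stored_value)  # True placeholder
```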