I have a number of files that I want to look at, and I would like to label each one based on a set of variables, e.g.,...

location = 'home'
source_system = 'Sys_'
date = '20160608'
file_name = location+source_system+date
print(file_name)

-> homeSys_20160608

...I then want to use file_name as the label for the dataframe, so something like...

file_name = sqlContext.read.parquet(file_path)

I have the file_path defined elsewhere.

When I try this, the dataframe ends up named file_name rather than homeSys_20160608, which is the value the variable originally held.

Is there any way to do what I want?

Basically, I am wondering if there is a way to create a dataframe whose name is built from variables. I want to create multiple dataframes, each with a different name, so I can analyze them in one Python notebook. Each dataframe would have the same structure but hold different data.

antimuon
  • This is a bad idea - you should [keep data out of your variable names](http://nedbatchelder.com/blog/201112/keep_data_out_of_your_variable_names.html). Use a dictionary instead. – MattDMo Jun 20 '16 at 13:40
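
A minimal sketch of the dictionary approach suggested in the comment above, in case it helps. The `load_frame` helper, the list of dates, and the `/data/...` path layout are all hypothetical stand-ins; in the notebook you would presumably call `sqlContext.read.parquet(file_path)` instead.

```python
# Hypothetical stand-in for sqlContext.read.parquet(file_path);
# it returns a plain dict so the sketch runs without Spark.
def load_frame(file_path):
    return {"path": file_path}

location = 'home'
source_system = 'Sys_'
dates = ['20160608', '20160609']  # assumed example dates

# Keep the label as a dictionary key instead of a variable name.
frames = {}
for date in dates:
    file_name = location + source_system + date
    file_path = '/data/' + file_name + '.parquet'  # hypothetical path layout
    frames[file_name] = load_frame(file_path)

# Each frame is then looked up by its label.
print(sorted(frames))
```

Each dataframe stays reachable as `frames['homeSys_20160608']` and so on, and you can loop over `frames.items()` to run the same analysis on all of them.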

1 Answer

You are joining 'location+source_system+date' without any filesystem delimiters.

Your string will look like this: homeSys_20160608

You want it to be this, I believe: home/Sys_20160608

You can either manually put one in like so:

file_name = location + '/' + source_system + date

Or use the os module:

import os
file_name = location + os.path.sep + source_system + date

so that it works on both Windows and Linux.

Will
  • Using `os.path.join` is an option as well, and IMHO clearer – Scott Hunter Jun 20 '16 at 13:32
  • True, but since two of the strings need to be joined directly and then `os.path.join`ed to the third, I thought this would be cleaner than `filename = os.path.join(location, source_system + date)` – Will Jun 20 '16 at 13:35
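
For comparison, a small sketch of the two spellings discussed in these comments, using the variable values from the question. For simple relative components like these, both should produce the same path on a given platform.

```python
import os

location = 'home'
source_system = 'Sys_'
date = '20160608'

# Explicit separator, as in the answer.
by_sep = location + os.path.sep + source_system + date

# os.path.join, as suggested in the comment; the last two parts
# are concatenated first because they belong to one file name.
by_join = os.path.join(location, source_system + date)

print(by_sep == by_join)
```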