I have eight PySpark dataframes with names such as "store", "inventory", and "storage".

I need to create a view for each one. To streamline this, I would like to avoid writing a separate call for every dataframe, such as

store.createOrReplaceTempView('store_view')

Is it possible to iterate through a list of dataframes and create a view for each? For example:

df_list = ["store", "inventory", "storage"]

for d in df_list:
    x = "convert dataframe d to a string"
    d.createOrReplaceTempView(x)

How can I assign the dataframe name as a string to x?

I suppose I could do the opposite and keep a list of strings, but then how would I get the dataframe from each string?

Chuck

2 Answers

You can use a dictionary for this purpose, where each key is the dataframe name and each value is the dataframe itself.

Example:

# Map each view name to its DataFrame
in_hash = {
    'store': store,
    'inventory': inventory,
    'storage': storage,
}

# Register every DataFrame under its dictionary key
for name, df in in_hash.items():
    df.createOrReplaceTempView(name)
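
The question names its view 'store_view' rather than 'store', so the dictionary approach might be wrapped in a small helper that appends a suffix. The following is only a sketch: `register_views` and its `suffix` parameter are hypothetical names, not part of PySpark, and the helper assumes every value in the dictionary exposes `createOrReplaceTempView`.

```python
def register_views(frames, suffix="_view"):
    """Register each DataFrame in `frames` as a temp view named key + suffix.

    `frames` maps a base name to a DataFrame. The helper name and the
    `suffix` parameter are illustrative assumptions, not a PySpark API.
    """
    registered = []
    for name, df in frames.items():
        view_name = f"{name}{suffix}"
        df.createOrReplaceTempView(view_name)
        registered.append(view_name)
    # Return the view names so the caller can see what was created
    return registered


# Usage, assuming `store`, `inventory`, and `storage` already exist:
# register_views({"store": store, "inventory": inventory, "storage": storage})
```

Returning the list of view names is a design choice that makes it easy to log or check which views were registered.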

Vaebhav

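To create a temporary view of a PySpark DataFrame by name, you can use the built-in globals() function, which returns the global symbol table as a dictionary, and look up the corresponding DataFrame object by its variable name. Note that this only finds DataFrames defined at module (global) scope.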

df_list = ["store", "inventory", "storage"]

for d in df_list:
    df = globals()[d]
    df.createOrReplaceTempView(d)
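
A slightly more defensive variant of the same lookup could take the namespace as an argument and fail early when a name is missing. This is a sketch under assumptions: `register_views_by_name` is a hypothetical helper, and the caller is expected to pass `globals()` (or any other dictionary of name to DataFrame).

```python
def register_views_by_name(names, namespace):
    """Look up each name in `namespace` and register it as a temp view.

    `namespace` is any dict of variable name -> DataFrame, for example
    the result of globals(). The helper name is an illustrative assumption.
    """
    # Check all names first so no view is created if one is missing
    missing = [n for n in names if n not in namespace]
    if missing:
        raise KeyError(f"No variables named: {missing}")
    for n in names:
        namespace[n].createOrReplaceTempView(n)


# Usage at module scope, assuming the three DataFrames already exist:
# register_views_by_name(["store", "inventory", "storage"], globals())
```

Passing the namespace explicitly keeps the lookup working when the DataFrames live in a dictionary other than the module globals.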
Myron_Ben4