
Wondering if anyone sees a problem with creating a dictionary of DataFrames in a PySpark application, as described below. Would this cause any memory issues? Many thanks.

def create_a_dict_with_df():
    dict_result = dict()
    dict_result['key1'] = df1  # as an example, assign a DataFrame to one value
    dict_result['key2'] = df2  # as an example, assign a DataFrame to another value
    dict_result['key3'] = df3

    return dict_result  # this is a dict with DataFrames as values
user3735871
  • I think only a pointer (reference) to the DataFrame is saved in the dictionary, and the computation happens when you execute an action on the DataFrame. – Junhua.xie Aug 18 '22 at 06:10
  • I do this for loops that generate hundreds of DataFrames; the keys act as the DataFrame names. See [this](https://stackoverflow.com/q/72983940/8279585) for an example; a short sketch of the pattern follows below. – samkart Aug 18 '22 at 08:00
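
A minimal sketch of the pattern described in the comments above, assuming a placeholder source DataFrame built with spark.range (the column name, keys, and sizes are made up for illustration). The dictionary only stores references to lazy DataFrames; Spark computes something only when an action such as count() is called on one of the values.

from pyspark.sql import SparkSession
from pyspark.sql import functions as F

spark = SparkSession.builder.appName("dict-of-dataframes").getOrCreate()

# Placeholder source data standing in for whatever produces df1, df2, ...
base_df = spark.range(1_000_000).withColumn("bucket", F.col("id") % 100)

# Build a dict of DataFrames in a loop; each key acts as the "name" of its DataFrame.
# Nothing is computed here: every value is just a reference to a lazy query plan.
dfs = {f"bucket_{b}": base_df.filter(F.col("bucket") == b) for b in range(100)}

# Computation happens only when an action is executed on a particular value.
print(dfs["bucket_42"].count())

Only the plan behind "bucket_42" is evaluated here; the other entries remain unexecuted plans, so the dictionary itself adds little driver-side memory beyond the DataFrame objects, unless data is explicitly pulled back with collect() or materialized with cache().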

0 Answers