
Wondering if anyone sees a problem with creating a dictionary of DataFrames in a PySpark application, as described below. Would this cause any memory issues? Many thanks.

def create_a_dict_with_df():
    dict_result = dict()
    dict_result['key1'] = df1  # as an example, assign a DataFrame to one value
    dict_result['key2'] = df2  # as an example, assign a DataFrame to another value
    dict_result['key3'] = df3

    return dict_result  # this is a dict with DataFrames as values
user3735871
  • I think only a pointer (reference) to the DataFrame is saved in the dictionary, and the computation happens when you execute an action on the DataFrame. – Junhua.xie Aug 18 '22 at 06:10
  • I do this for loops that generate hundreds of DataFrames; the keys act as the DataFrame names. See [this](https://stackoverflow.com/q/72983940/8279585) for an example; a short sketch of the pattern follows below. – samkart Aug 18 '22 at 08:00
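
A minimal sketch of the pattern described in the comments above, assuming a placeholder source DataFrame built with spark.range (the column name, keys, and sizes are made up for illustration). The dictionary only stores references to lazy DataFrames; Spark computes something only when an action such as count() is called on one of the values.

from pyspark.sql import SparkSession
from pyspark.sql import functions as F

spark = SparkSession.builder.appName("dict-of-dataframes").getOrCreate()

# Placeholder source data standing in for whatever produces df1, df2, ...
base_df = spark.range(1_000_000).withColumn("bucket", F.col("id") % 100)

# Build a dict of DataFrames in a loop; each key acts as the "name" of its DataFrame.
# Nothing is computed here: every value is just a reference to a lazy query plan.
dfs = {f"bucket_{b}": base_df.filter(F.col("bucket") == b) for b in range(100)}

# Computation happens only when an action is executed on a particular value.
print(dfs["bucket_42"].count())

Only the plan behind "bucket_42" is evaluated here; the other entries remain unexecuted plans, so the dictionary itself adds little driver-side memory beyond the DataFrame objects, unless data is explicitly pulled back with collect() or materialized with cache().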

0 Answers