I want to retrieve the variable name of a DataFrame instance that I pass as an argument to my function, so that I can use this name inside the function. Example in a script:
display(df_on_step_42)
I would like to retrieve the string "df_on_step_42" inside the display function (which displays the content of the DataFrame).
As a last resort, I can pass both the DataFrame and its name as arguments:
display(df_on_step_42, "df_on_step_42")
But I would prefer to do without this second argument.
PySpark DataFrames are immutable, and every transformation returns a new DataFrame, so in our data pipeline we cannot systematically attach a name attribute to each new DataFrame derived from another one.