Is there a defined standard for effective memory management in Spark?
What if I create a couple of DataFrames or RDDs and then keep reducing that data with joins and aggregations?
Will those DataFrames or RDDs still hold resources until the session or job completes?
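To make the scenario concrete, here is a minimal Scala sketch of what I mean: two DataFrames get joined and then aggregated into a smaller result. The data, column names, and app name are purely illustrative.

```scala
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.functions._

object JoinAggregateSketch {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .appName("join-aggregate-sketch") // hypothetical app name
      .master("local[*]")
      .getOrCreate()
    import spark.implicits._

    // Two small illustrative DataFrames (hypothetical data)
    val orders    = Seq((1, "A", 100.0), (2, "B", 250.0)).toDF("order_id", "customer", "amount")
    val customers = Seq(("A", "US"), ("B", "DE")).toDF("customer", "country")

    // Intermediate DataFrame produced by a join
    val joined = orders.join(customers, Seq("customer"))

    // Further reduced by an aggregation
    val totals = joined.groupBy("country").agg(sum("amount").as("total_amount"))

    totals.show()

    // Question: do `orders`, `customers`, and `joined` keep holding executor
    // resources after `totals` is computed, or only if they were explicitly cached?
    spark.stop()
  }
}
```

In other words, once `totals` has been computed, do the intermediate DataFrames (`orders`, `customers`, `joined`) still occupy memory on the executors, or is that only the case if I explicitly `cache()`/`persist()` them?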