With Spark 2.0.2, doing ...
val parent: DataFrame = ...
parent.persist()
parent.count
val child: DataFrame = parent.filter(...)
child.persist()
child.count
parent.unpersist()
... does not unpersist the child DataFrame. However, it does with Spark 2.2.0 and Spark 2.3.0 (it may with 2.1 as well; I did not try).
Is there a way to reproduce the persist behavior of Spark 2.0.2 with newer versions of Spark? I've tried checkpoint, but performance was not great.