Say I have three RDD transformation functions called in sequence, starting from rdd1:
val rdd2 = rdd1.f1
val rdd3 = rdd2.f2
val rdd4 = rdd3.f3
Now I want to cache rdd4, so I call rdd4.cache().
My question:
Will only the result from the action on rdd4 be cached, or will every RDD above rdd4 be cached? Say I want to cache both rdd3 and rdd4: do I need to cache them separately?