I am using Spark SQL for a data migration project. How should I implement a staging area in Spark? When should I use Spark SQL cache versus persist? Are there any real-world use cases?
~Sha
As with RDDs (see "What is the difference between cache and persist?"), the only difference between cache and persist is that persist lets you set a non-default storage level.
There is one important difference, though: in the RDD API cache uses MEMORY_ONLY, whereas its Dataset counterpart uses MEMORY_AND_DISK.