I am using Spark SQL for a data migration project. How should I implement a staging area in Spark? When should I use cache versus persist in Spark SQL? Are there any real-world use cases?

~Sha

BdEngineer

1 Answer

As with RDDs (see "What is the difference between cache and persist?"), the only difference between cache and persist is that persist lets you choose a non-default storage level.

There is one important difference, though. Unlike the RDD API, where cache uses MEMORY_ONLY, the Dataset counterpart defaults to MEMORY_AND_DISK.
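
To make the defaults concrete, here is a minimal Scala sketch that reads back the storage level chosen by each call. It assumes a local SparkSession and a Spark version where Dataset.storageLevel is available (2.1 or later); the object name CacheVsPersist, the helper storageLevels, and the sample ranges are hypothetical and only there for illustration.

```scala
import org.apache.spark.sql.SparkSession
import org.apache.spark.storage.StorageLevel

// Hypothetical helper object, not part of Spark.
object CacheVsPersist {

  // Returns the storage levels reported for RDD.cache, Dataset.cache,
  // and an explicit Dataset.persist(level), in that order.
  def storageLevels(spark: SparkSession): (StorageLevel, StorageLevel, StorageLevel) = {
    // RDD API: cache() is shorthand for persist(StorageLevel.MEMORY_ONLY).
    val rdd = spark.sparkContext.parallelize(1 to 100).cache()

    // Dataset API: cache() is shorthand for persist() with the Dataset default level.
    val cached = spark.range(0, 100).cache()

    // persist(level) is how a non-default level is requested.
    // A different range is used so this plan does not match the one cached above.
    val explicit = spark.range(100, 200).persist(StorageLevel.DISK_ONLY)

    val levels = (rdd.getStorageLevel, cached.storageLevel, explicit.storageLevel)

    // Release the cache registrations once the levels have been read.
    rdd.unpersist()
    cached.unpersist()
    explicit.unpersist()
    levels
  }

  def main(args: Array[String]): Unit = {
    // Assumption: a local master is good enough for this demonstration.
    val spark = SparkSession.builder()
      .master("local[*]")
      .appName("cache-vs-persist")
      .getOrCreate()

    val (rddLevel, cachedLevel, explicitLevel) = storageLevels(spark)
    println(s"RDD.cache          -> ${rddLevel.description}")
    println(s"Dataset.cache      -> ${cachedLevel.description}")
    println(s"Dataset.persist(.) -> ${explicitLevel.description}")

    spark.stop()
  }
}
```

Note that both cache and persist are lazy: the data is only materialized when the first action runs. Also, persisting a second Dataset with the same logical plan as one that is already cached keeps the original storage level, which is why the sketch uses two different ranges.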