I ran into a memory problem while using Spark's caching mechanism. I am currently using Kryo-based Encoders and was wondering whether switching to bean encoders would help reduce the size of my cached Dataset.
Basically, what are the pros and cons of bean encoders versus Kryo serialization when working with Encoders? Are there any performance improvements? And is there a way to compress a cached Dataset other than persisting it with a serialized (SER) storage level?
For the record, I have found a similar question that compares the two approaches, but it does not go into much detail.
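For context, my current setup looks roughly like the sketch below (the Event class is a simplified placeholder; my real bean has more fields). It shows the two encoder choices I am asking about, side by side:

```java
import java.io.Serializable;
import java.util.Arrays;

import org.apache.spark.sql.Dataset;
import org.apache.spark.sql.Encoder;
import org.apache.spark.sql.Encoders;
import org.apache.spark.sql.SparkSession;
import org.apache.spark.storage.StorageLevel;

public class CacheDemo {
  // Placeholder bean for illustration; the real class has more fields.
  public static class Event implements Serializable {
    private long id;
    private String payload;
    public long getId() { return id; }
    public void setId(long id) { this.id = id; }
    public String getPayload() { return payload; }
    public void setPayload(String payload) { this.payload = payload; }
  }

  public static void main(String[] args) {
    SparkSession spark = SparkSession.builder()
        .master("local[*]").appName("cache-demo").getOrCreate();

    Event e = new Event();
    e.setId(1L);
    e.setPayload("a");

    // Current approach: Kryo encoder. Each row is stored as an opaque
    // binary blob, so Spark cannot use its columnar cache format.
    Encoder<Event> kryoEnc = Encoders.kryo(Event.class);
    Dataset<Event> dsKryo = spark.createDataset(Arrays.asList(e), kryoEnc);
    dsKryo.persist(StorageLevel.MEMORY_ONLY_SER());

    // Alternative I'm considering: bean encoder. Spark knows the schema,
    // so the cached data can use the columnar in-memory representation.
    Encoder<Event> beanEnc = Encoders.bean(Event.class);
    Dataset<Event> dsBean = spark.createDataset(Arrays.asList(e), beanEnc);
    dsBean.cache();

    spark.stop();
  }
}
```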