Most of the discussions I found online about resource allocation focus on the maximum memory setting for --executor-memory, taking a few memory overheads into account.

But I would imagine that for a simple job, like reading in a 100MB file and counting the number of rows, on a cluster with a total of 500GB of memory available across nodes, I shouldn't ask for a number of executors and a memory allocation that, with all memory overheads accounted for, could take all 500GB, right? Even 1 executor with 3GB or 5GB of memory seems like overkill. How should I think about the right memory size for a job?
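To make the overhead arithmetic concrete, here is a rough sketch of how much memory a request would actually reserve on the cluster. It assumes the documented default for executor memory overhead (10% of executor memory, with a 384 MiB minimum) and ignores the driver; the function names are made up for illustration and are not part of any Spark API.

```python
# Rough estimate of the memory reserved by a set of Spark executors.
# Assumption: default overhead rule, max(384 MiB, 10% of executor memory).
# The driver's own memory is deliberately left out of this sketch.

MIN_OVERHEAD_MIB = 384      # assumed default minimum overhead per executor
OVERHEAD_FRACTION = 0.10    # assumed default overhead factor


def container_mib(executor_memory_mib: int) -> int:
    """Memory reserved for one executor: heap plus off-heap overhead."""
    overhead = max(MIN_OVERHEAD_MIB, int(executor_memory_mib * OVERHEAD_FRACTION))
    return executor_memory_mib + overhead


def job_footprint_mib(num_executors: int, executor_memory_mib: int) -> int:
    """Total memory reserved across all executors of a job."""
    return num_executors * container_mib(executor_memory_mib)


if __name__ == "__main__":
    # (number of executors, --executor-memory in MiB) for a few hypothetical requests
    for n, mem in [(1, 1024), (1, 3072), (2, 5120)]:
        print(f"{n} x {mem} MiB -> {job_footprint_mib(n, mem)} MiB reserved")
```

Under these assumptions, even a single small executor reserves a bit more than what I pass to --executor-memory, but nowhere near the cluster total.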

Thank you!

Supergan
  • A web search for "apache spark sizing memory" will turn up a lot. SO is really about specific coding questions, so this might not be quite on topic. – Feb 22 '19 at 21:26
  • Unfortunately, SO isn't a place for discussing these kinds of matters, as noted in the comment above. To briefly give some pointers, I'd say that for that amount of data, Spark is actually overkill. If you want to tune your cluster memory (driver/executor), you ought to know how much data you are working with, and some monitoring would also be needed. This post is a bit old but it would be a good starting point: https://blog.cloudera.com/blog/2015/03/how-to-tune-your-apache-spark-jobs-part-1/ – eliasah Feb 22 '19 at 22:26
  • Thanks for the feedback guys... the 100MB is just an example, maybe exaggerated, to illustrate a situation where I have a lot more RAM available on the cluster than the job's data size. – Supergan Feb 22 '19 at 22:40
  • Also, I've seen many well-discussed posts on SO that are not really about specific coding questions, so I'm not sure mine should be closed, e.g. https://stackoverflow.com/questions/20301661/what-will-spark-do-if-i-dont-have-enough-memory – Supergan Feb 22 '19 at 22:41
  • In your case, the answer isn't short or obvious, and it wouldn't even fit within a single page. That question isn't actually a discussion, it's just factual. People have given whole conference talks about tuning Spark jobs, if you get my point. – eliasah Feb 22 '19 at 22:49
  • That's fair... let me see if I can be more specific about the question. Thanks! – Supergan Feb 22 '19 at 22:56

0 Answers