
I'm building a Spark application that works with very large matrices, each requiring many GB. I am running Spark on a single AWS instance with the following invocation:

spark-submit --driver-memory 20g --executor-memory 20g --class "mycoordinates.App" --master local[7] my.jar args

Once my problem set reaches a certain size threshold I start getting OOM errors, and increasing the driver memory does not resolve the problem. (My understanding is that with --master local everything runs in the driver JVM, so the executor memory setting does not matter; I increased it anyway just to be sure, and it had no effect either.)

Inside my application I put the following statements to check the heap (the application is written in Scala):

println(" mem avail: " + java.lang.Runtime.getRuntime().totalMemory() )
println(" max mem: " + java.lang.Runtime.getRuntime().maxMemory() )
println(" free mem: " + java.lang.Runtime.getRuntime().freeMemory() )

Following is a typical output:

Driver memory set to 20G

mem avail: 2075918336
max mem: 21099708416
free mem: 1720845616

Driver memory set to 8G

mem avail: 2075918336
max mem: 8303607808
free mem: 1720720376

Driver memory set to 3G

mem avail: 2075918336
max mem: 3113877504
free mem: 1720850720

Driver memory set to 1G

mem avail: 1037959168
max mem: 1037959168
free mem: 1000899656

The maxMemory value tracks the driver memory setting from the command line as expected. However, totalMemory and freeMemory do not. If I reduce the driver memory setting below 2G, totalMemory and freeMemory shrink as expected, but for driver memory settings over 2G they do not change: totalMemory is always exactly the same value, and freeMemory varies only slightly within a small range, which looks like random variation.

I also printed the spark config settings from within the app, and they all seem to agree with the command line settings.
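For reference, a minimal sketch of that check, assuming a SparkContext named sc is in scope:

// Print every effective Spark setting, sorted by key (sc is assumed here)
sc.getConf.getAll.sortBy(_._1).foreach { case (key, value) =>
  println(s"$key = $value")
}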

All the info I have found so far indicates that --driver-memory will increase the heap size, but it seems like some other parameter limits the heap to 2GB.

Can anyone tell me what else must be configured in order for the heap to grow beyond 2GB?

Tim Ryan
  • After more searching I was able to find that I can pass an argument through to the jvm: --driver-java-options -Xms8g. Now I'm wondering if there is a way to make the jvm default to use the max memory for the initial allocation so that I don't have to keep these 2 different arguments coordinated. – Tim Ryan Jun 01 '17 at 02:46
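For reference, combining that flag with the original invocation would look like this (the 20g values just mirror the example above, and are exactly the two arguments that have to be kept coordinated by hand):

spark-submit --driver-memory 20g --driver-java-options "-Xms20g" --class "mycoordinates.App" --master local[7] my.jar args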

1 Answer


Java allocates heap memory lazily: the JVM won't claim more memory than it currently needs unless you tell it to at startup. In your case it currently needs about 2G (that is what totalMemory reports), but, if I read it correctly, it can grow up to the value you specified, which is what maxMemory reports.
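A minimal sketch to see the lazy growth in action (run with, say, -Xmx4g so the allocations fit): totalMemory climbs toward the fixed maxMemory as the heap expands.

object HeapGrowthDemo {
  def main(args: Array[String]): Unit = {
    val rt = Runtime.getRuntime
    println("max mem: " + rt.maxMemory()) // the -Xmx ceiling, fixed for the JVM's lifetime
    // Hold references so the GC can't reclaim the allocations
    val chunks = scala.collection.mutable.ArrayBuffer.empty[Array[Byte]]
    for (i <- 1 to 8) {
      chunks += new Array[Byte](256 * 1024 * 1024) // claim another 256 MB
      println("after " + i * 256 + " MB: total=" + rt.totalMemory() + " free=" + rt.freeMemory())
    }
  }
}

Starting the JVM with -Xms equal to -Xmx sets the initial heap to the maximum, so totalMemory matches maxMemory from the start; that is what passing "-Xms..." through --driver-java-options achieves for the Spark driver.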

Write-up on SO: What are Runtime.getRuntime().totalMemory() and freeMemory()?

elmalto
  • Yes, that is what I was expecting, but I had a case where I was getting an out-of-memory error even though the max heap was large enough. Setting the initial heap to the max heap size seemed to fix the problem. However, now I'm trying to reproduce that example and I'm unable to, so maybe something else changed and I didn't realize it. – Tim Ryan Jun 04 '17 at 22:40