After reading How to allocate more executors per worker in Standalone cluster mode?, I understand that we can allocate more (N) executors per worker by setting SPARK_WORKER_INSTANCES = N in spark-env.sh. But is there any limit on N? By which factor (system configuration) can I decide the maximum number of executors I can allocate per worker? (So far I have assumed it is the number of cores on the worker node, 12 in my case, so at most 12 executors per worker.)
I also set the following on a standalone cluster with 1 master (64 GB) and 3 worker nodes (each a 64 GB, 12-core machine):
SPARK_WORKER_INSTANCES = 10   ### can I set it to > 12 (total cores)?
SPARK_WORKER_CORES = 15       ### will it throw an error (since I have at most 12 cores per worker machine)?
SPARK_WORKER_MEMORY = 60g
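Since the limit I am asking about presumably comes from the machine itself, this is how I checked the numbers I am sizing against on each worker node (just a sketch of the Linux commands I used):

```shell
# Hardware limits on one worker node (mine report 12 cores, 64 GB):
nproc       # number of logical cores available
free -g     # total and free memory in GB
```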
If the above settings start 10 executors per worker, I will have 10 * 3 = 30 executors for my application. Can I set SPARK_WORKER_INSTANCES to more than 12, e.g. 13 or 15?
If my second setting, SPARK_WORKER_CORES = 15, throws an error (please correct me on this), suppose I reset it to the maximum of 12 cores. How do the 10 executors per worker divide those 12 cores among themselves, 12/10? Or are cores never divided among executors? How does that work?
From the third setting, SPARK_WORKER_MEMORY = 60g, how is memory allocated to my executors? 60/10 = 6 GB each?
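To make my assumption concrete, here is the even-split arithmetic I have in mind; this is my guess at how Spark divides resources, not something I have confirmed:

```python
# My assumed even-split model for one worker node (please correct
# me if Spark allocates cores/memory differently):
worker_cores = 12        # physical cores per worker machine
worker_memory_gb = 60    # SPARK_WORKER_MEMORY
instances = 10           # SPARK_WORKER_INSTANCES
worker_nodes = 3

cores_per_executor = worker_cores / instances          # 1.2 cores each?
memory_per_executor_gb = worker_memory_gb / instances  # 6 GB each?
total_executors = instances * worker_nodes             # 30 executors

print(cores_per_executor, memory_per_executor_gb, total_executors)
```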
And if I am not wrong, executors help with caching data. If from my third setting above I get a total (max) of 60 GB (10 * 6) per worker, and I then set spark.memory.fraction 0.7 in the default conf, how much memory will I have to persist my data? 3 questions:
1. What is the maximum amount of data I can cache (assume a 6 GB file takes 6 GB when cached)? Is it 18 GB in total (here I assumed the cacheable fraction is 1 - 0.7 = 0.3; please correct me), with each executor caching 1.8 GB?
2. If I try to cache 30 GB of data with the MEMORY_AND_DISK storage level, will exactly 18 GB be cached and the remaining 12 GB written to disk?
3. How is the cache size calculated, from total worker memory or per executor: 60 * 0.3 or 10 * (6 * 0.3)? (Though both yield the same result.)
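For reference, here are the two ways I computed the 18 GB figure. The cacheable fraction of 0.3 (i.e. 1 - spark.memory.fraction) is my assumption, and that assumption is exactly what I am asking about:

```python
# Two equivalent ways I computed my assumed cache budget:
total_worker_memory_gb = 60
executors = 10
per_executor_gb = total_worker_memory_gb / executors  # 6 GB each
cache_fraction = 0.3  # my assumption: 1 - spark.memory.fraction (0.7)

from_total = total_worker_memory_gb * cache_fraction                # ~18 GB
from_per_executor = executors * (per_executor_gb * cache_fraction)  # ~18 GB
print(from_total, from_per_executor)
```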