
My understanding so far was that each action creates one job in a Spark application. But consider the following scenario, where I just create a DataFrame using the .range() method:

df = spark.range(10)

Since my spark.default.parallelism is 10, the resulting DataFrame has 10 partitions. Now I simply run the .show() and .count() actions on the DataFrame:

df.show()
df.count()
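
As a sanity check (assuming a standard PySpark shell where spark is the active SparkSession), the partition count can be confirmed directly:

df.rdd.getNumPartitions()   # 10, matching spark.default.parallelism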

When I checked the Spark History UI, I saw 3 jobs for .show() and 1 job for .count():

[Screenshot: Spark History UI showing 3 jobs for .show() and 1 job for .count()]

Why are there 3 jobs for the .show() method?

I have read somewhere that .show() eventually calls .take() internally, and that take iterates through partitions, which decides the number of jobs. But I didn't understand that part. What exactly decides the number of jobs?
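
Here is a minimal sketch of my mental model (an assumption based on what I have read about take(), not Spark's actual source): take(n) scans partitions in rounds, submitting one job per round, starting with a single partition and widening each subsequent round by a scale-up factor (mirroring spark.sql.limit.scaleUpFactor, default 4) until n rows are collected. Since show() prints 20 rows, it appears to fetch 21 rows under the hood, which for 10 single-row partitions would give 3 rounds:

# Simplified model only; the function name and the scale-up rule are
# assumptions for illustration, not Spark's real implementation.
def jobs_for_take(rows_needed, num_partitions, rows_per_partition,
                  scale_up_factor=4):
    collected = 0   # rows gathered so far
    scanned = 0     # partitions scanned so far
    jobs = 0        # one Spark job per scan round
    batch = 1       # first round scans a single partition
    while collected < rows_needed and scanned < num_partitions:
        batch = min(batch, num_partitions - scanned)
        scanned += batch
        collected += batch * rows_per_partition
        jobs += 1
        batch *= scale_up_factor    # widen the next round
    return jobs

# With 10 partitions of 1 row each, fetching 21 rows gives 3 jobs:
# 1 partition, then 4, then the remaining 5.
print(jobs_for_take(21, 10, 1))   # 3

Is this roughly how Spark decides the job count, or is the actual rule different?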
