How a spark application starts using sbt run.

Question

I actually want to know the underlying mechanism of how this happens that when I execute sbt run the spark application starts !

What is the difference between this and running spark on standalone mode and then deploying application on it using spark-submit.

If someone can explain how the jar is submitted and who makes the task and assigns it in both the cases, that would be great. Please help me out with this or point to some read where i can make my doubts cleared !

score 3 · Answer 1 · answered May 03 '17 at 12:50

First, read this.

Once you are familiar with the terminologies, different roles, and their responsibilities, read below paragraph to summarize.

There are different ways to run a spark application(a spark app is nothing but a bunch of class files with an entry point).

You can run the spark application as single java process(usually for development purposes). This is what happens when you run sbt run. In this mode, all the services like driver, workers etc are run inside a single JVM.

But above way of running is only for development and testing purposes as it won't scale. That means you won't be able to process a huge amount of data. This is where other ways of running a spark app come into the picture(Standalone, mesos, yarn etc).

Now read this.

In these modes, there will be dedicated JVMs for different roles. Driver will be running as a separate JVM, there could be 10s to 1000s of executor JVMs running on different machines(Crazy right!).

The interesting part is, the same application that runs inside a single JVM will be distributed to run on 1000s of JVMs. This distribution of the application, life-cycle of these JVMs, making them fault-tolerance etc are taken care by Spark and the underlying cluster frameworks.

"_when you run sbt run. In this mode, all the services like driver, workers etc are run inside a single JVM_" - This is not true. You can add `deployMode("cluster")` in the code — OneCricketeer, Apr 24 '23 at 14:10

How a spark application starts using sbt run.

1 Answers1

Linked