
The big data support team at a client is telling me to change the deploy mode of my application from client to cluster. The idea behind this is that an application whose driver runs locally (client mode) can take up too many resources on that machine.

I was not able to find any reference in the Spark documentation about that resource consumption, and my jobs were entirely redesigned to run the driver locally because many *.json and *.sql files are required for them to run correctly. My understanding of the Spark docs is that the driver dispatches all tasks to the cluster and only coordinates their sequencing and statuses, so I shouldn't need to worry about the driver's resource consumption.
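For context, my submission currently looks roughly like this (assuming a YARN cluster; the class name and paths are illustrative, not my real ones). In client mode the driver runs on the machine I submit from, so it can read those files straight from the local filesystem:

```
# Client mode: the driver runs on the submitting machine,
# so it can read the *.json and *.sql files from local paths.
spark-submit \
  --master yarn \
  --deploy-mode client \
  --class com.example.MyJob \
  my-job.jar /local/path/to/queries
```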

Is that correct? Can someone point me to some docs where I can learn more about this?

My environment is running Spark 2.1.1.


1 Answer


You can find details at Apache Spark: Differences between client and cluster deploy modes. As far as I understand, the concern is that in client mode the driver consumes resources on a machine outside the Spark cluster (the one you submit from). In cluster mode the driver runs on a node inside the cluster, so no outside resources get consumed.
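As a rough sketch of what the switch could look like (again assuming YARN; class and file names are illustrative), --deploy-mode cluster moves the driver onto a cluster node, and --files ships your local *.json and *.sql files so they land in the working directory of the driver and executors:

```
# Cluster mode: the driver runs inside the cluster, so local paths
# on the submitting machine are not visible to it. --files copies
# the listed files into the container working directories.
spark-submit \
  --master yarn \
  --deploy-mode cluster \
  --files queries/job.sql,conf/job.json \
  --class com.example.MyJob \
  my-job.jar job.sql job.json
```

The job then opens the shipped files by bare name, since they sit in the driver's working directory on the cluster.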
