
I am learning how to use Spark and I have a simple program. When I run the jar file it gives me the right result, but there are some errors in the stderr file, like this:

 15/05/18 18:19:52 ERROR executor.CoarseGrainedExecutorBackend: Driver   Disassociated [akka.tcp://sparkExecutor@localhost:51976] -> [akka.tcp://sparkDriver@172.31.34.148:60060] disassociated! Shutting down.
 15/05/18 18:19:52 WARN remote.ReliableDeliverySupervisor: Association with remote system [akka.tcp://sparkDriver@172.31.34.148:60060] has failed, address is now gated for [5000] ms. Reason is: [Disassociated].

You can see the whole stderr file here:

http://172.31.34.148:8081/logPage/?appId=app-20150518181945-0026&executorId=0&logType=stderr

I searched for this problem and found this:

Why spark application fail with "executor.CoarseGrainedExecutorBackend: Driver Disassociated"?

I turned up spark.yarn.executor.memoryOverhead as it suggested, but it doesn't work.

I just have one master node (8G memory), and in Spark's slaves file there is only one slave node, the master itself. I submit like this:

./bin/spark-submit --class .... --master spark://master:7077 --executor-memory 6G --total-executor-cores 8 /path/..jar hdfs://myfile

I don't know what the executor and the driver are... lol... sorry about that.

Can anybody help me?

赵祥宇
  • 172.31.34.148 is a private address, we could not see it. – yjshen May 19 '15 at 07:00
  • please share if you have figured out the solution. I am facing the same error. http://ec2-54-174-186-17.compute-1.amazonaws.com:8080/ – Rogers Jefrey L Jun 17 '15 at 22:14
  • any updates on this ? – Kalyanaraman Santhanam Jul 12 '15 at 06:31
  • This happens if the Spark Driver fails (memory issue, node restart etc.), and [by default it is not fault-tolerant](http://stackoverflow.com/questions/26618464/what-happens-if-the-driver-program-crashes). `spark.yarn.driver.memoryOverhead` param can help with memory based issues. – CᴴᴀZ Apr 21 '17 at 09:56

2 Answers


If the Spark Driver fails, it gets disassociated (from the YARN AM). Try the following to make it more fault-tolerant (see the example commands after the note below):

  • spark-submit with --supervise flag on Spark Standalone cluster
  • yarn-cluster mode on YARN
  • spark.yarn.driver.memoryOverhead parameter for increasing Driver's memory allocation on YARN

Note: Driver supervision (spark.driver.supervise) is not supported on a YARN cluster (yet).
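
For illustration, here is a rough sketch of what such submit commands could look like. The class name, jar path, and memory numbers are placeholders, not values taken from the question, and option names may differ slightly between Spark versions:

    # Spark Standalone: --supervise only applies in cluster deploy mode,
    # so the driver runs on a worker and gets restarted if it dies.
    ./bin/spark-submit --class com.example.MyApp \
      --master spark://master:7077 --deploy-mode cluster --supervise \
      /path/to/app.jar hdfs://myfile

    # YARN: run the driver inside the YARN Application Master (cluster mode)
    # and give it extra off-heap headroom via the memory overhead setting (in MB).
    ./bin/spark-submit --class com.example.MyApp \
      --master yarn --deploy-mode cluster \
      --conf spark.yarn.driver.memoryOverhead=1024 \
      /path/to/app.jar hdfs://myfile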

CᴴᴀZ

An overview of driver vs. executor (and others) can be found at http://spark.apache.org/docs/latest/cluster-overview.html or https://jaceklaskowski.gitbooks.io/mastering-apache-spark/content/spark-architecture.html

They are Java processes that can run on different machines or on the same machine, depending on your configuration. The driver contains the SparkContext and declares the RDD transformations (and, if I'm not mistaken, something like an execution plan), then communicates that to the Spark master, which creates task definitions and asks the cluster manager (its own, YARN, Mesos) for resources (worker nodes); those tasks in turn get sent to executors for execution.
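
If it helps to see those processes concretely on a single-node standalone setup like yours, a quick (hedged) way is to list the JVMs with jps; the process names below are typical for client-mode standalone, but they can vary by Spark version and deploy mode:

    # jps ships with the JDK and lists running Java processes.
    jps

    # Typical output on a standalone, client-mode, single-node setup:
    #   Master                        <- the standalone cluster manager
    #   Worker                        <- hosts executors on this node
    #   SparkSubmit                   <- your driver (holds the SparkContext)
    #   CoarseGrainedExecutorBackend  <- an executor (one JVM per executor)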

Executors communicate certain information back to the master, and as far as I understand, if the driver encounters a problem or crashes, the master will take note and will tell the executor (which in turn logs) what you see: "driver is disassociated". This could happen for a lot of reasons, but the most common one is that the Java process (the driver) runs out of memory (try increasing spark.driver.memory).
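
For example, a hedged sketch of how spark.driver.memory can be raised (class name, paths, and the 2g/4g values are placeholders; on a single 8 GB node that already gives 6 GB to the executor, you may also need to lower --executor-memory so everything fits):

    # On the command line, per application:
    ./bin/spark-submit --driver-memory 2g --executor-memory 4g \
      --class com.example.MyApp --master spark://master:7077 \
      /path/to/app.jar hdfs://myfile

    # Or persistently for all applications, in conf/spark-defaults.conf:
    # spark.driver.memory   2g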

There are some differences when running on YARN vs. standalone vs. Mesos, but I hope this helps. If the driver is disassociated, the Java process running as the driver likely encountered an error; the master logs might have something, and I'm not sure if there are driver-specific logs. Hopefully someone more knowledgeable than me can provide more info.