
I have three virtual machines running as a distributed Spark cluster. I am using Spark 1.3.0 on an underlying Hadoop 2.6.0.

If I run the SparkPi example:

/usr/local/spark130/bin/spark-submit \
  --class org.apache.spark.examples.SparkPi \
  --master yarn-client \
  /usr/local/spark130/examples/target/spark-examples_2.10-1.3.0.jar 10000

I get these warnings/errors and eventually an exception:

 WARN NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
15/04/08 12:37:06 WARN ReliableDeliverySupervisor: Association with remote system [akka.tcp://sparkYarnAM@virtm4:47128] has failed, address is now gated for [5000] ms. Reason is: [Disassociated].
15/04/08 12:37:12 WARN ReliableDeliverySupervisor: Association with remote system [akka.tcp://sparkYarnAM@virtm4:45975] has failed, address is now gated for [5000] ms. Reason is: [Disassociated].
15/04/08 12:37:13 ERROR YarnClientSchedulerBackend: Yarn application has already exited with state FINISHED!

When I check the logs of the container, I see that it was SIGTERM-ed:

15/04/08 12:37:08 INFO yarn.YarnAllocator: Container request (host: Any, capability: <memory:1408, vCores:1>)
15/04/08 12:37:08 INFO yarn.YarnAllocator: Container request (host: Any, capability: <memory:1408, vCores:1>)
15/04/08 12:37:08 INFO yarn.ApplicationMaster: Started progress reporter thread - sleep time : 5000
15/04/08 12:37:12 ERROR yarn.ApplicationMaster: RECEIVED SIGNAL 15: SIGTERM
15/04/08 12:37:12 INFO yarn.ApplicationMaster: Final app status: UNDEFINED, exitCode: 0, (reason: Shutdown hook called before final status was reported.)
15/04/08 12:37:12 INFO yarn.ApplicationMaster: Unregistering ApplicationMaster with UNDEFINED (diag message: Shutdown hook called before final status was reported.)

SOLUTION: I solved the problem by using Java 7 instead of Java 8. This situation was reported as a bug, but it was rejected as such: https://issues.apache.org/jira/browse/SPARK-6388. Still, changing the Java version did work for me.

toobee
  • I acknowledge. I faced the same issue. After changing the jdk version to 7, it worked. I guess some issue with oracle-8-jdk! – vivek_nk Sep 11 '15 at 05:45

2 Answers


The association may be lost due to Java 8's excessive memory allocation issue: https://issues.apache.org/jira/browse/YARN-4714

You can force YARN to ignore this by setting the following properties in yarn-site.xml, which disable the physical and virtual memory checks:

<property>
    <name>yarn.nodemanager.pmem-check-enabled</name>
    <value>false</value>
</property>

<property>
    <name>yarn.nodemanager.vmem-check-enabled</name>
    <value>false</value>
</property>
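Alternatively, instead of disabling the checks entirely, you can leave them on and give each container more headroom for the JVM's off-heap allocations via the Spark memory-overhead settings. A sketch for spark-defaults.conf — the 1024 MB values are illustrative assumptions, not from the question; tune them for your cluster:

```
# Extra off-heap memory (in MB) that YARN reserves per container,
# on top of the executor/driver heap. Illustrative values.
spark.yarn.executor.memoryOverhead  1024
spark.yarn.driver.memoryOverhead    1024
```

This keeps YARN's memory enforcement in place while accounting for Java 8's larger off-heap footprint.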
simpleJack
0


I encountered a similar problem before, until I found this issue.

Try stopping your SparkContext instance explicitly with sc.stop()
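In case it helps, here is a minimal sketch of what that looks like in a Spark driver (the object name and job logic are illustrative, not from the question):

```scala
import org.apache.spark.{SparkConf, SparkContext}

// Illustrative driver skeleton: stopping the context in a finally
// block ensures the YARN ApplicationMaster unregisters cleanly
// even if the job body throws.
object ExampleJob {
  def main(args: Array[String]): Unit = {
    val sc = new SparkContext(new SparkConf().setAppName("ExampleJob"))
    try {
      val evens = sc.parallelize(1 to 1000).filter(_ % 2 == 0).count()
      println(s"even numbers: $evens")
    } finally {
      sc.stop() // explicit stop, as suggested above
    }
  }
}
```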

Jonathan
  • I tried your idea under Java 8, the problem remains. It seems to be a Java8 problem that is not acknowledged by the developers (see my updated question above). – toobee Apr 10 '15 at 08:00