
My question is similar to other posts about "Initial job has not accepted any resources". I have read their suggestions and am still not able to submit the job from Java. I am wondering if somebody with more experience installing Spark sees an obvious miss or knows how to troubleshoot this.

Spark: check your cluster UI to ensure that workers are registered.

My configuration is as follows: (Fedora VM) master: Spark 2.0.2, prebuilt with Hadoop; worker: a single instance.

(Windows host) The client is a sample Java app, configured with:

conf.set("spark.cores.max","1");
conf.set("spark.shuffle.service.enabled", "false");
conf.set("spark.dynamicAllocation.enabled", "false");

Attached is a snapshot of the Spark UI. As far as I can tell, my job is received, submitted and running. It also appears that I am not over-utilizing CPU or RAM.

[screenshot of the Spark master UI]

The Java (client) console reports:

12:15:47.816 DEBUG parentName: , name: TaskSet_0, runningTasks: 0
12:15:48.815 DEBUG parentName: , name: TaskSet_0, runningTasks: 0
12:15:49.806 WARN Initial job has not accepted any resources; check your cluster UI to ensure that workers are registered and have sufficient resources
12:15:49.816 DEBUG parentName: , name: TaskSet_0, runningTasks: 0
12:15:50.816 DEBUG parentName: , name: TaskSet_0, runningTasks: 0

The Spark worker log reports:

16/11/22 12:16:34 INFO Worker: Asked to launch executor app-20161122121634-0012/0 for Simple Application
16/11/22 12:16:34 INFO SecurityManager: Changing modify acls groups to: 
16/11/22 12:16:34 INFO SecurityManager: SecurityManager: authentication disabled; ui acls disabled; users with view permissions: Set(john); groups with view permissions: Set(); users with modify permissions: Set(john); groups with modify permissions: Set()
16/11/22 12:16:34 INFO ExecutorRunner: Launch command: "/apps/jdk1.8.0_101/jre/bin/java" "-cp " "/apps/spark-2.0.2-bin-hadoop2.7/conf/:/apps/spark-2.0.2-bin-hadoop2.7/jars/*" "-Xmx1024M" "-Dspark.driver.port=29015" "org.apache.spark.executor.CoarseGrainedExecutorBackend" "--driver-url" "spark://CoarseGrainedScheduler@192.168.56.1:29015" "--executor-id" "0" "--hostname" "192.168.56.103" "--cores" "1" "--app-id" "app-20161122121634-0012" "--worker-url" "spark://Worker@192.168.56.103:38701"
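
A quick way to sanity-check the launch command above is to verify, from inside the Fedora guest, that a plain TCP connection to the driver URL handed to the executor (192.168.56.1:29015 here) actually succeeds. A minimal sketch, independent of Spark; the port changes per run unless spark.driver.port is pinned:

import java.net.InetSocketAddress;
import java.net.Socket;

public class DriverReachabilityCheck {
    public static void main(String[] args) throws Exception {
        // Address and port copied from the executor launch command above;
        // adjust them to whatever the current run reports.
        String driverHost = "192.168.56.1";
        int driverPort = 29015;

        try (Socket socket = new Socket()) {
            socket.connect(new InetSocketAddress(driverHost, driverPort), 5000);
            System.out.println("Driver is reachable at " + driverHost + ":" + driverPort);
        } catch (Exception e) {
            System.out.println("Cannot reach driver: " + e);
        }
    }
}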

[second screenshot of the Spark UI]

  • Try to kill the running application and see what happens! And let us know too. – Shivansh Nov 22 '16 at 18:25
  • I tried stopping the client many times. The UI displays the application in the completed section. That's wrong, because the job did not actually execute; you can see "Simple Application" -> "Finished" in the attached image. The worker log shows: 16/11/22 12:17:12 INFO Worker: Asked to kill executor app-20161122121634-0012/0 16/11/22 12:17:12 INFO ExecutorRunner: Runner thread for executor app-20161122121634-0012/0 interrupted 16/11/22 12:17:12 INFO ExecutorRunner: Killing process! 16/11/22 12:17:13 INFO Worker: Executor app-20161122121634-0012/0 finished with state KILLED, exitStatus 143 – Vortex Nov 22 '16 at 18:32
  • First try to submit the application, then see if it still says `Initial job has not accepted any resources`, then go to the Spark UI to see how many applications are submitted. My hunch is that one application would be waiting while the other executes and consumes all the resources! Then try to kill the running application and see what happens. – Shivansh Nov 22 '16 at 18:35
  • I checked; there was only one job running. I added a 2nd image to my post. I can't think of anything else to look into. Any thoughts? FYI, running spark-submit computed Pi fine: spark-submit --verbose --class org.apache.spark.examples.SparkPi --master spark://192.168.56.103:7077 ../examples/jars/spark-examples_2.11-2.0.2.jar – Vortex Nov 22 '16 at 19:36
  • This should not happen; if SparkPi runs fine, then the server configuration is correct! What does your SparkConf look like, and are you trying to submit multiple applications at once from the code? – Shivansh Nov 23 '16 at 02:11
  • Shivansh, first of all thank you for looking into it. My setup is Windows (host) and Linux on VirtualBox (guest). The Spark master and a single worker are started in the guest. SparkPi works fine from Linux; I can't submit any job from Windows. I am seeing this error in the stderr in the web UI. I believe it means the master bound itself to localhost, even though I executed ./start-master.sh -h 192.168.56.103. // from the worker's log in the web UI: Caused by: java.io.IOException: Failed to connect to /127.0.0.1:40055 – Vortex Nov 23 '16 at 04:13
  • Is your spark cluster running on yarn? – Santanu Dey Jun 16 '17 at 05:13
  • Have a look at https://stackoverflow.com/a/44581586/808096 – Santanu Dey Jun 16 '17 at 05:47

1 Answer


Do you have any firewall blocking communications? As stated in my other answer:

Apache Spark on Mesos: Initial job has not accepted any resources:

While most of the other answers focus on resource allocation (cores, memory) on the Spark slaves, I would like to highlight that a firewall can cause exactly the same issue, especially when you are running Spark on cloud platforms.

If you can see the Spark slaves in the web UI, you have probably opened the standard ports 8080, 8081, 7077 and 4040. Nonetheless, when you actually run a job, it uses SPARK_WORKER_PORT, spark.driver.port and spark.blockManager.port, which are randomly assigned by default. If your firewall is blocking these ports, the master cannot retrieve any job-specific response from the slaves and returns the error.

You can run a quick test by opening all the ports and seeing whether the slave accepts jobs.
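
If opening everything is not an option, a narrower variant (a sketch, assuming a Java driver) is to pin the normally random ports to fixed values and open only those in the firewall:

import org.apache.spark.SparkConf;

public class PinnedPorts {
    // Pin the normally random driver-side ports so they can be opened
    // explicitly in the firewall (the port numbers are arbitrary examples).
    public static SparkConf pin(SparkConf conf) {
        conf.set("spark.driver.port", "29015");
        conf.set("spark.blockManager.port", "29016");
        return conf;
    }
}

On the worker side, SPARK_WORKER_PORT can be fixed in conf/spark-env.sh; on Fedora the chosen ports can then be opened with firewall-cmd --permanent --add-port=<port>/tcp followed by firewall-cmd --reload (assuming firewalld is the active firewall).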

– Fontaine007