
I have submitted a Spark job through Airflow. Sometimes the job works, and sometimes it gives no output at all. Even after 2-3 hours of waiting, the job shows nothing beyond "Waiting for job output...".

I am using the dataproc-1-4-deb10 image.

It's a simple job that pulls data over JDBC using PySpark. It sometimes runs without error and sometimes produces nothing at all.
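For reference, a job of this shape can be reproduced with a minimal sketch like the following; all connection details (driver, host, port, database, table, credentials) are hypothetical placeholders, not taken from the question:

```python
# Minimal sketch of a JDBC pull with PySpark; every connection detail
# below (driver, host, port, db, table, credentials) is a placeholder.

def jdbc_options(host, port, db, table, user, password):
    """Build the options dict passed to spark.read.format("jdbc")."""
    return {
        "url": f"jdbc:postgresql://{host}:{port}/{db}",
        "driver": "org.postgresql.Driver",
        "dbtable": table,
        "user": user,
        "password": password,
        "fetchsize": "1000",  # fetch rows in batches instead of all at once
    }

def run_job():
    """Entry point; call this in the script submitted to the cluster."""
    from pyspark.sql import SparkSession
    spark = SparkSession.builder.appName("jdbc-pull").getOrCreate()
    df = (spark.read.format("jdbc")
          .options(**jdbc_options("db-host", 5432, "mydb",
                                  "public.events", "user", "secret"))
          .load())
    df.show(5)
    spark.stop()
```

If a trivial job like this also hangs at "Waiting for job output..." intermittently, the problem is on the cluster/submission side rather than in the job logic.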

  • Can you share details about the job? Also, can you check the behavior by running a simple SparkPi job multiple times and verifying whether you get job output? – Animesh Jul 12 '21 at 23:39
  • @Animesh Is it possible it's due to a YARN failure? I don't know anything about Hadoop; it's just a guess. – Shubham Asabe Jul 13 '21 at 15:05
  • I'm facing the same. The same job is already running, but for some reason gcloud probably hasn't even submitted it to the YARN cluster. I don't see the jobs as submitted in the YARN UI. – Piyush Patel Jul 22 '22 at 00:57
  • @ShubhamAsabe I'm facing the same issue. Did you find any solution to this? – Ravi Jain May 25 '23 at 17:23
  • @RaviJain Nope, just restarting the job worked for me, but I'd suggest you raise a Google support ticket. – Shubham Asabe May 25 '23 at 17:27
  • @ShubhamAsabe What I figured out is that the job was stuck in the ACCEPTED state because no resources (memory) were available in the YARN queue, even though the cluster had plenty of resources free. I tried the properties below, but each application submission still took 896m for the AM, and after a certain number of apps, submissions would get stuck: `yarn.scheduler.capacity.maximum-am-resource-percent` changed from 0.25 to 0.95, and `yarn:yarn.nodemanager.resource.memory-mb=13544` – Ravi Jain Jun 07 '23 at 05:57
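For context, the two YARN properties mentioned in the last comment can be set at cluster creation time via Dataproc's `--properties` flag, using the `capacity-scheduler:` and `yarn:` file prefixes. A hedged example, with the cluster name and region as placeholders and the values taken from the comment:

```shell
# Hypothetical cluster creation applying the YARN properties from the
# comment above; cluster name and region are illustrative only.
gcloud dataproc clusters create my-cluster \
    --region=us-central1 \
    --image-version=1.4-debian10 \
    --properties='capacity-scheduler:yarn.scheduler.capacity.maximum-am-resource-percent=0.95,yarn:yarn.nodemanager.resource.memory-mb=13544'
```

Raising `maximum-am-resource-percent` lets more concurrent ApplicationMasters run before new submissions sit in ACCEPTED, though per the comment this did not fully resolve the issue.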

0 Answers