I'm running Spark SQL on YARN and have hit the same issue described in this question: Spark: long delay between jobs
There is a long delay after the action that saves a table. In the Spark UI, I can see that the saveAsTable() job has completed, but no new job is submitted for a long time. (Spark UI screenshot)
The answer to the linked question says that the I/O operations happen on the master node, but I doubted that.
During the gap, I checked the HDFS location where the table is saved and saw a _temporary directory rather than a _SUCCESS file. So it looks like the answer is right and Spark is finishing the save on the driver side. Why?
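The check described above can also be scripted. This is only a rough sketch: the object name `OutputCommitCheck`, the method `commitState`, and the table path in the usage note are hypothetical, and it assumes the Hadoop client libraries are on the classpath (as they are in a Spark application). It only looks for the `_SUCCESS` and `_temporary` markers under the output directory.

```scala
import org.apache.hadoop.conf.Configuration
import org.apache.hadoop.fs.Path

// Hypothetical helper: reports whether a job's output directory
// looks committed, still in progress, or neither.
object OutputCommitCheck {
  def commitState(outputDir: Path, conf: Configuration): String = {
    // Resolve the file system (HDFS, local, ...) from the path's scheme.
    val fs = outputDir.getFileSystem(conf)
    if (fs.exists(new Path(outputDir, "_SUCCESS"))) "committed"
    else if (fs.exists(new Path(outputDir, "_temporary"))) "in progress"
    else "unknown"
  }
}
```

From a Spark shell this could be called as `OutputCommitCheck.commitState(new Path("/path/to/table"), spark.sparkContext.hadoopConfiguration)`, where the path is a placeholder for the table's actual location.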
I'm using the code below to save the table:
dataframe.write.partitionBy(partitionColumn).format(format)
  .mode(SaveMode.Overwrite)
  .saveAsTable(s"$tableName")
By the way, the format is ORC. Can anyone give me some suggestions? Thanks in advance.