
Environment: Spark 1.6.2; Linux 2.6.x (Red Hat 4.4.x); Hadoop 2.4.x.

I launched a job this morning through `spark-submit` but do not see the files it was supposed to write. I've read a bit about the web UI for monitoring Spark jobs, but at this point my only visibility into what is happening on the Hadoop cluster and HDFS is a bash shell on a terminal.

Question: what are the standard ways, from the command line, to get a quick readout on Spark jobs and any log trail they might leave behind (during or after job execution)?

Thanks.

Driss NEJJAR
Kode Charlie
  • If you know the application ID, have you tried this? https://stackoverflow.com/questions/37420537/how-to-check-status-of-spark-applications-from-the-command-line – ar7 Dec 21 '18 at 23:33
  • On which master are you running your job? If it's on yarn, you can try `yarn logs -applicationId` (see the sketch after these comments). – Driss NEJJAR Dec 22 '18 at 05:22
  • We're on `yarn`. If my job prints the `SparkContext.applicationId`, then `yarn` will tell me a lot about that job. But my question is more general: is there a shell-command that lists *all* jobs queued or running? If you're from a *nix background, the equivalent command would be `ps`. – Kode Charlie Dec 26 '18 at 16:42
  • Maybe `yarn application -list`? – Driss NEJJAR Dec 30 '18 at 10:37
  • @DrissNejjar -- thx, I'll give you credit if you post your response as an answer. – Kode Charlie Jan 02 '19 at 18:47
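For reference, a minimal sketch of the log-retrieval approach suggested in the comments. The application ID shown is a made-up placeholder, and YARN log aggregation must be enabled for logs of finished jobs to be available:

```bash
# Fetch the aggregated driver/executor logs for a known application
# (application_1545372386000_0042 is a placeholder ID)
yarn logs -applicationId application_1545372386000_0042

# Dump them to a local file for grepping
yarn logs -applicationId application_1545372386000_0042 > app_0042.log
```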

1 Answer


You can use `yarn application -list`.
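A short sketch of how this might look in practice (the application ID is a placeholder; exact flags can vary slightly by Hadoop version):

```bash
# ps-style listing of the applications YARN currently tracks
yarn application -list

# Include queued as well as running jobs
yarn application -list -appStates ACCEPTED,RUNNING

# Drill into one application once you have its ID (placeholder shown)
yarn application -status application_1545372386000_0042
```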

Driss NEJJAR