
I'm trying to set a timeout on a workflow run in Oozie, so that the workflow fails once a given amount of time has passed since the run started.

For example, one can use sla:should-end within coordinator.xml for an action/workflow, or specify sla:should-end within workflow.xml for the entire workflow.
But the problem is that SLA only sends an email. What I want is for the workflow to simply fail after some time (measured from the start of the run).
Is this possible? Any sample code would help.

Nishu Tayal
Tay Cho
  • Were you able to successfully use Oozie's SLA feature? If yes, can you please help answer - https://stackoverflow.com/questions/57281650/oozie-not-sending-sla-email-alerts – Saurabh Agrawal Jul 31 '19 at 04:15

1 Answer


I don't know of any straightforward solution for this in Oozie or in YARN. There is a YARN ticket that would provide a convenient solution.

Until it's implemented, you can try something like this:

  • add an FS touchz action at the beginning of the workflow to create a marker file (e.g. /tmp/WF_ID)
  • add a fork to the workflow after the file is created
  • one of the paths should be a shell action that repeatedly checks whether the file still exists with hdfs dfs -ls /tmp/WF_ID until a timeout is reached (see this post for some hints)
  • the other path is the original workflow logic, followed by an FS delete action at the end that deletes the file in /tmp
  • the shell action should kill the workflow if the timeout is reached before the file is deleted from HDFS. If the file is deleted before the timeout is reached, the shell script should terminate normally, letting the workflow continue
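The shell script for the watchdog path could look roughly like the sketch below. This is untested against a real cluster and makes several assumptions: the function name wait_or_kill and its arguments are made up for illustration, OOZIE_URL must be set in the environment, and both the hdfs and oozie command-line clients must be available on whatever node runs the shell action. It uses hdfs dfs -test -e instead of hdfs dfs -ls, because its exit code reports existence directly.

```shell
#!/usr/bin/env bash
# Watchdog sketch for the forked shell action (hypothetical helper).
# Usage: wait_or_kill MARKER_PATH TIMEOUT_SECS POLL_SECS WORKFLOW_ID
# Assumption: OOZIE_URL is set and the hdfs and oozie CLIs are on the PATH.
wait_or_kill() {
  local marker="$1" timeout="$2" interval="$3" wf_id="$4"
  local waited=0

  while [ "$waited" -lt "$timeout" ]; do
    # A non-zero exit from "hdfs dfs -test -e" is treated as "marker gone",
    # i.e. the main path finished and deleted it.
    if ! hdfs dfs -test -e "$marker"; then
      return 0
    fi
    sleep "$interval"
    waited=$((waited + interval))
  done

  # Marker still present after the timeout: kill the whole workflow.
  oozie job -oozie "$OOZIE_URL" -kill "$wf_id"
  return 1
}
```

A call such as wait_or_kill /tmp/WF_ID 3600 30 "$1" would poll every 30 seconds for up to an hour, with the workflow ID passed in as the first script argument (for example from ${wf:id()} in the shell action definition). Note that any error from the hdfs command is also treated as "marker gone", so you may want stricter checking in practice.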

This is quite an ugly workaround for the problem, but I can't think of a cleaner solution at the moment.

Community
gezapeti