I need to translate a list of dependent nodes into an AWS Step Functions state machine. A Step Functions definition allows parallel branches, even nested several levels deep. Unfortunately, it does not support dependencies between tasks in different branches, so the whole parallel step has to complete before any of its results are available to subsequent tasks in the state machine.
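For reference, this is roughly the shape of definition I mean. It is only a minimal sketch: the state names `RunBoth`, `A`, `B` and `C` are made up for illustration, and `Pass` states stand in for the real tasks. `C` cannot start until every branch of `RunBoth` has finished.

```python
import json

# Hypothetical state machine: A and B run in parallel, then C runs.
# Pass states are placeholders for the real tasks.
definition = {
    "StartAt": "RunBoth",
    "States": {
        "RunBoth": {
            "Type": "Parallel",
            "Branches": [
                {"StartAt": "A", "States": {"A": {"Type": "Pass", "End": True}}},
                {"StartAt": "B", "States": {"B": {"Type": "Pass", "End": True}}},
            ],
            # C only starts once *all* branches above have completed.
            "Next": "C",
        },
        "C": {"Type": "Pass", "End": True},
    },
}

# Serialize to the JSON form that a state machine definition uses.
print(json.dumps(definition, indent=2))
```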
In my diagram below, a simple parallel branch like the one shown in Graph 1 is easily supported by Step Functions.
Graph 2, and especially Graph 3, are where it becomes a problem.
As a simple approach, we could introduce additional nodes that collect the results for their dependent nodes, as demonstrated in Graphs 2b and 3b, but this introduces dependencies that didn't exist before:
- In Graph 2b these new dependencies have been introduced:
    - E -> A, F -> A
- In Graph 3b these new dependencies have been introduced:
    - E -> A, F -> A, F -> B, C -> E
This is a problem because manual approval tasks can take anywhere from hours to days. Later steps would then be delayed unnecessarily by tasks they do not depend on.
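To make the extra waiting concrete, here is a minimal sketch of how the added dependencies could be listed programmatically. The graph in it is a made-up example, not one of the graphs from my diagram, and the helper names (`ancestors`, `staged_deps`, `spurious`) are my own. It compares what each node transitively waits for in the original graph with what it waits for once the nodes are grouped into sequential stages, where each stage is assumed to run as one parallel step.

```python
def ancestors(deps):
    """Map each node to the set of nodes it transitively waits for.

    `deps` maps a node to the set of nodes it directly depends on.
    Assumes the graph is acyclic.
    """
    closure = {}

    def visit(node):
        if node not in closure:
            closure[node] = set()
            for parent in deps.get(node, ()):
                closure[node] |= {parent} | visit(parent)
        return closure[node]

    for node in deps:
        visit(node)
    return closure


def staged_deps(stages):
    """Dependencies implied by running the stages one after another.

    Every node in a stage waits for every node in the previous stage,
    which is the effect of a parallel step followed by another one.
    """
    deps = {}
    for i, stage in enumerate(stages):
        for node in stage:
            deps[node] = set(stages[i - 1]) if i > 0 else set()
    return deps


def spurious(original, stages):
    """Pairs (node, other) where node now waits for other but did not before."""
    want = ancestors(original)
    got = ancestors(staged_deps(stages))
    return sorted(
        (node, other)
        for node in got
        for other in got[node] - want.get(node, set())
    )


# Hypothetical graph: C needs A, D needs A and B, E needs B.
original = {"A": set(), "B": set(), "C": {"A"}, "D": {"A", "B"}, "E": {"B"}}
# Naive translation: one parallel step for A and B, then one for C, D and E.
stages = [["A", "B"], ["C", "D", "E"]]

for node, other in spurious(original, stages):
    print(f"{node} now also waits for {other}")
```

In this made-up example, C ends up waiting for B and E ends up waiting for A, which is exactly the kind of unnecessary delay I would like to avoid or at least minimise.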
Any suggestions on how to solve this? Maybe I could take a different approach? Maybe there is some fancy graph theory algorithm I can apply? I don't even know what words to use to explain this problem in graph theory.
Here is a URL for playing with these graphs on draw.io if you need to.