Imagine I have a directed acyclic graph (DAG) with vertices and edges.
A vertex can be one of the following two types:
- A computational task (T)
- A resource (R)
An edge represents a dependency. It ALWAYS originates from some computational task vertex T and ends in some resource vertex R.
Restrictions on the graph structure:
- Task vertices depend only on resource vertices (not on other tasks). This means task vertices have ONLY outgoing edges, and resource vertices have ONLY incoming edges.
- A task vertex cannot have more than one edge to the same resource vertex.
- Resource vertices don't depend on anything (no outgoing edges).
- Each computational task vertex has a minimum of 3 and a maximum of 4 outgoing edges. In other words, each task depends on at least 3 and at most 4 resources.
Semantics:
- The above graph is a task dependency graph. Whenever a task Tx runs, it blocks ALL the resources it depends on until it completes.
- Each resource can be used by at most 1 task at any given time. So a running task can prevent other tasks that share any of its resources from starting until it completes.
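In other words, two tasks conflict exactly when their resource sets intersect. A minimal sketch of that conflict relation (Python; `deps` is a hypothetical task-to-resources map, not part of any real API):

```python
from itertools import combinations

def conflict_pairs(deps):
    """Return pairs of tasks that share at least one resource,
    and therefore can never run at the same time."""
    return [(a, b) for a, b in combinations(deps, 2)
            if deps[a] & deps[b]]
```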
Question:
Given the above graph, what algorithm can I use to compute the set of tasks that can run in parallel without blocking each other? At any given time I want maximum parallelization: I will use the algorithm to discover all the tasks that don't block each other and run them, and whenever a task completes, re-evaluate the graph to see whether any newly unblocked tasks can be spun off.
I'm looking for the algorithm to use for this kind of computation. It sounds like a hard graph problem, but I suspect it isn't entirely new.
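Concretely, the re-evaluation step I have in mind looks roughly like this greedy sketch (Python; `deps`, `busy_resources`, and `pending` are hypothetical names of my own, not an established API):

```python
def runnable_tasks(deps, busy_resources, pending):
    """Greedily pick pending tasks whose resource sets are pairwise
    disjoint and don't touch any resource held by a running task."""
    chosen, claimed = [], set(busy_resources)
    for task in pending:
        if claimed.isdisjoint(deps[task]):
            chosen.append(task)
            claimed |= deps[task]
    return chosen
```

One thing I'm unsure about: this greedy pass depends on the order of `pending` and only yields a maximal set of non-conflicting tasks, not necessarily a maximum one.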
Example:
In the provided example, I can first run T1 and T3; once they complete, I can run T2 and T4.
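One concrete resource assignment that produces exactly that schedule (my illustrative choice; any assignment with the same sharing pattern works, and each task here has 3 resources, within the 3-4 range):

```python
# Illustrative dependency map: T1 and T2 share R3, T3 and T4 share R8.
deps = {
    "T1": {"R1", "R2", "R3"},
    "T2": {"R3", "R4", "R5"},
    "T3": {"R6", "R7", "R8"},
    "T4": {"R8", "R9", "R10"},
}

# Wave 1: T1 and T3 touch disjoint resources, so they can run together,
# while T2 is blocked by T1 (shared R3) and T4 by T3 (shared R8).
assert deps["T1"].isdisjoint(deps["T3"])
assert deps["T1"] & deps["T2"] and deps["T3"] & deps["T4"]

# Wave 2: once T1 and T3 finish, T2 and T4 are also mutually compatible.
assert deps["T2"].isdisjoint(deps["T4"])
```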