1. Building a dependency graph between the assignments
I take it your goal is to compute as much assignments in parallel as possible while obtaining the same result as if you were sequentially computing the assignments one after another. To identify what can be done in parallel, I would suggest building a dependency graph between your assignment expression.
Starting from the last assignment: Any variable that participates in the assignment means that to compute it, we need to compute the last assignment that made a modification to the variable involved. With the example you gave:
&0 = 3 // A0 (variable initialization)
&1 = 6 // A1 (variable initialization)
&0 = &0 + 4 + 3 // A2
&1 = &0 - &1 + 2 + 5 // A3
&0 = &1 + 3 // A4
The dependencies are:
A4 <-- A3 (To compute A4, you need to compute A3 before)
A3 <-- A2 and A1 (To compute A3, you need to compute A2 for the value of &0 and A1 for &1)
A2 <-- A0
A1 <-- no dependency
A0 <-- no dependency
2. Recursively computing the dependencies
Disclaimer: I am not very familiar with the use of locks and threads in C++. But I think I can give an general approach.
Now that you have established what the dependencies are, you can try to compute them. I can suggest this simplistic approach for a procedure of a thread:
computeAssignment(Assignment * as) {
if (as is not computed) {
lock Assignment as
for each dependency of as {
spawn a thread running computeAssignment(dependency) // Recursion occurs here
}
wait for all the threads spawned in the for loop to finish
compute the result of this assignment
unlock Assignment as
}
return result of as
}
When a thread starts to examine the assignment it was given, it first checks if it was computed before. This check must be done in a manner that makes the first thread to make this check block access to all other threads if the assignment is not computed.
While the first thread is busy spawning threads to compute dependencies and computing the result of the assignment, the other threads are blocked on this first check. When the first thread eventually unlocks the assignment, all the threads that were waiting will see that the result was computed.
To start the computation, you should spawn one of these threads for each variable you have and giving them the last assignment that modifies the variable as parameter. In your example, that would be 2 threads starting with assignments A4 (last to modify &0) and A3 (last to modify &1).