
Given the following function call:

f(g(), h())

since the order of evaluation of function arguments is unspecified (still the case in C++11 as far as I'm aware), could an implementation theoretically execute g() and h() in parallel?

Such a parallelisation could only kick in were g and h known to be fairly trivial (in the most obvious case, accessing only data local to their bodies) so as not to introduce concurrency issues but, beyond that restriction, I can't see anything to prohibit it.

So, does the standard allow it? Even if only by the as-if rule?

(In this answer, Mankarse asserts otherwise; however, he does not cite the standard, and my read-through of [expr.call] hasn't revealed any obvious wording.)

Lightness Races in Orbit
  • I would think that this falls along the same lines as any other optimization, as long as it respects the as-if rule. – Mysticial Nov 18 '12 at 19:10
  • In my view, it theoretically _can_ parallelize the execution of g() and h(). I am not sure whether the standard prohibits or allows it, but my feeling is that it should be allowed. – Aniket Inge Nov 18 '12 at 19:11
  • It could start a rocket to ask the moon men to tell it the result of the function call and travel back. – Johannes Schaub - litb Nov 18 '12 at 19:31
  • As you say, this would kick in only for fairly trivial `g` and `h` (in which case the compiler could probably prove it's equivalent to some sequential evaluation anyway), and then it wouldn't really have much benefit: creating a thread would probably be more expensive than evaluating either of the functions. In the case of bigger functions, where it could _in principle_ boost performance greatly, it would be hard to maintain control over the possible explosion of recursive thread spawns. Even Haskell, which has freedom of evaluation order _everywhere_ and has lightweight threads, doesn't do this. – leftaroundabout Nov 18 '12 at 23:36
  • @leftaroundabout: Not suggesting you'd want to do it. – Lightness Races in Orbit Nov 19 '12 at 01:01
  • If we had a `pure` specifier for functions that tells the implementation that the function doesn't access (possibly) shared state (i.e. has no side effects), this might become possible in the future. – Xeo Nov 19 '12 at 06:23
  • Unfortunately your question is very ambiguous. In your question you are asking whether a compiler could do something that you cannot notice (because the involved functions are too trivial), but you accepted an answer that explains a rule forbidding noticeable differences. I cannot see why you didn't pick an example with two print statements, then. – Johannes Schaub - litb Nov 24 '12 at 13:10
  • There is probably not much point in such fine-grained parallelization, since the functions g and h would have to be very simple for the compiler to decide they are parallelizable, and if they are that simple the synchronization overhead between the processors would destroy the gain from the parallelization. There are other languages that are much better suited to such operations. – Dr. Hans-Peter Störr Nov 26 '12 at 20:04
  • @hstoerr: No argument regarding that. – Lightness Races in Orbit Nov 26 '12 at 20:07
  • @MooingDuck: Unfortunately, `constexpr` is neutered. – Lightness Races in Orbit Dec 10 '12 at 22:48
  • I wouldn't depend on such a thing at all, even if the standard said so. Why? Because this is such a corner case that most people wouldn't notice when they change things later. Even if g() and h() are called within function calls, I would use synchronization to prevent a race condition. – Xolve Mar 08 '13 at 10:07

4 Answers

The requirement comes from [intro.execution]/15:

... When calling a function ... Every evaluation in the calling function (including other function calls) that is not otherwise specifically sequenced before or after the execution of the body of the called function is indeterminately sequenced with respect to the execution of the called function [Footnote: In other words, function executions do not interleave with each other.].

So any execution of the body of g() must be indeterminately sequenced with respect to (that is, not overlapping with) the evaluation of h() (because h() is an expression in the calling function).

The critical point here is that g() and h() are both function calls.

(Of course, the as-if rule means that the possibility cannot be entirely ruled out, but it should never happen in a way that could affect the observable behaviour of a program. At most, such an implementation would just change the performance characteristics of the code.)

Mankarse
  • That's the one. It's easy to get carried away with thinking that this only applies to `g()` vs `f()`, and `h()` vs `f()`, but indeed the net effect is that the `g()` and the `h()` cannot be interleaved. – Lightness Races in Orbit Nov 19 '12 at 01:00
  • I think my original confusion was in reading the "indeterminately" more _strongly_ than the "sequenced", in **indeterminately sequenced**. Indeed, "sequenced" is the key term, as I imagine that interleaved calls would have no "sequence" with respect to each other. – Lightness Races in Orbit Jun 20 '13 at 17:39
  • This answer is correct, but it is worth noting that this is only so because neither of the calls `g()` and `h()` takes any arguments. If there had been arguments, the evaluation of those arguments (in preparation for the execution of `g` or `h`) would **not** be indeterminately sequenced with each other (so they could be interleaved), though each would be indeterminately sequenced with respect to the body of the other function. – Marc van Leeuwen Aug 31 '16 at 08:08

As long as you can't tell the difference, whatever the compiler does to evaluate these functions is entirely up to it. Clearly, the evaluation of the functions cannot involve any access to shared, mutable data, as this would introduce data races. The basic guiding principles are the "as-if" rule and the fundamental observable operations, i.e., access to volatile data, I/O operations, access to atomic data, etc. The relevant section is 1.9 [intro.execution].

Dietmar Kühl
  • I guess I was unclear. I acknowledge that it couldn't be applied in the general case. But in the cases where the behaviour of functions themselves weren't a blocker to this possibility... – Lightness Races in Orbit Nov 18 '12 at 19:26
  • 4
    I thought *I* was clear: If the compiler can tell that it won't violate its constraints, it can do whatever it feels to do. Put differently: As long as you can't tell (other than possibly measuring the time it takes) that it executed the functions in parallel, it is free to do so. – Dietmar Kühl Nov 18 '12 at 19:28
  • I read your answer slightly differently the first time around. Did it change at all in the five-minute grace window? Or am I going slightly mad? – Lightness Races in Orbit Nov 18 '12 at 19:38
  • 1
    @LightnessRacesinOrbit: Well, it changed shortly after I first posted but not substantially: I fixed "relecant" to be "relevant" but I don't think this changed the semantics ;) – Dietmar Kühl Nov 18 '12 at 19:44
  • Good point. The spec defines the semantics, not the implementation, so if there's no semantic difference in parallelizing, then the compiler can do it. – dhasenan Nov 18 '12 at 23:27

Not unless the compiler knows exactly what g(), h(), and anything they call do.

The two expressions are function calls, which may have unknown side effects. Therefore, parallelizing them could cause a data race on those side effects. Since the C++ standard does not allow argument evaluation to cause a data race on any side effects of the expressions, the compiler can only parallelize them if it knows that no such data race is possible.

That means walking through each function and looking at exactly what it does and/or calls, then tracking through those functions in turn, and so on. In the general case, it's not feasible.

Nicol Bolas

Easy answer: when the functions are sequenced, even if indeterminately, there is no possibility of a race condition between the two, which is not true if they are parallelized. Even a pair of one-line "trivial" functions could cause one.

void g()
{
    *p = *p + 1;
}


void h()
{
    *p = *p - 1;
}

If p is a pointer shared by g and h, then calling g and h sequentially in either order leaves the value pointed to by p unchanged. If they are parallelized, the reads of *p and the assignments to it could be interleaved arbitrarily between the two:

  1. g reads *p and finds the value 1.
  2. h reads *p and also finds the value 1.
  3. g writes 2 to *p.
  4. h, still using the value 1 it read before, writes 0 to *p.

Thus, the behavior is different when they are parallelized.

nevsan