So I understand that reusing a variable that has been post-incremented within the arguments of the same function call is undefined behavior. My understanding is that this is not a problem in constructors. My question is about `tie`, which seems oddly halfway between the two.

Given `pair<int, int> func()`, can I do:

tie(*it++, *it) = func();

Or is that undefined behavior?

Jonathan Mee
  • `std::tie` *is* a function. I don't see why it would be any different. – super Jul 11 '19 at 16:20
  • In C++17, this is no longer UB, although whether the second argument is the incremented value or not is *unspecified*. – Davis Herring Jul 11 '19 at 23:01
  • @DavisHerring Isn't that by definition UB? – Jonathan Mee Jul 12 '19 at 15:30
  • @DavisHerring Ahh... I do love those nasal demons. So I guess we're saying this is now *compiler defined how* it will behave? Equally difficult really, just maybe not so disastrous. – Jonathan Mee Jul 12 '19 at 15:54
  • @JonathanMee: It’s not that either (called *implementation-defined*); one or the other behavior will happen on each call, with no guarantee of consistency even between multiple calls in a loop. (In practice, it’s rather unlikely to vary within one compiler version, but could very well change with any change to the code.) – Davis Herring Jul 12 '19 at 16:02
  • What do you mean by "this is not a problem in constructors"? Are you comparing parenthesized argument lists with braced initializer lists? – Ben Voigt Jul 14 '19 at 03:39
  • @BenVoigt I was wondering if `tie` derived any of the initializer list properties since it can be used for initialization. – Jonathan Mee Jul 14 '19 at 17:34
  • @JonathanMee: No, because you aren't using a braced initializer list. Construction doesn't get the ordering guarantee either, if using parentheses. – Ben Voigt Jul 15 '19 at 05:26
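
To illustrate the braced-initializer-list point from the comments, here is a minimal sketch. It is not from the thread: `func` is a hypothetical stand-in for the question's function, `assign_braced` is an invented helper, and the sketch assumes a `std::vector<int>` with at least two elements available from `it`. It relies on the rule that the initializer-clauses of a braced-init-list are evaluated in the order they appear.

```cpp
#include <cassert>
#include <tuple>
#include <utility>
#include <vector>

// Hypothetical stand-in for the question's func().
std::pair<int, int> func() { return {10, 20}; }

// Hypothetical helper: writes func()'s result into the two elements at `it`.
// Assumption: at least two elements are reachable from `it`.
void assign_braced(std::vector<int>::iterator it) {
    // Braces instead of parentheses: the clauses are evaluated left to right,
    // so `*it++` (including its side effect) completes before `*it`.
    std::tuple<int&, int&>{*it++, *it} = func();
}
```

With braces the first reference binds to the original element and the second to the next one. Writing the same thing with parentheses, or with `std::tie(...)`, would not give that ordering guarantee.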

1 Answer

Since C++17, this code has unspecified behavior. There are two possible outcomes:

  • the first argument is the result of dereferencing the original iterator, the second argument is the result of dereferencing the incremented iterator; or

  • the first argument and the second argument are both the result of dereferencing the original iterator.

Per [expr.call]/8:

[...] The initialization of a parameter, including every associated value computation and side effect, is indeterminately sequenced with respect to that of any other parameter. [...]

So the second argument to tie may be either the result of dereferencing the incremented iterator or the original iterator.
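
One way to sidestep the unspecified ordering is to sequence the increment explicitly before calling `tie`. The following is only a sketch under assumptions: `assign_sequenced` is a hypothetical helper, the container is taken to be a `std::vector<int>`, and at least two elements must be reachable from `it`.

```cpp
#include <cassert>
#include <tuple>
#include <utility>
#include <vector>

// Hypothetical helper: stores `values` into the two elements starting at `it`.
// Assumption: at least two elements are reachable from `it`.
void assign_sequenced(std::vector<int>::iterator it, std::pair<int, int> values) {
    // Each declaration is a full-expression, so the increment is complete
    // before the second dereference is evaluated.
    int& first = *it++;
    int& second = *it;
    std::tie(first, second) = values;
}
```

Because the two dereferences live in separate statements, the result no longer depends on the order in which the arguments of `tie` are evaluated.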


Prior to C++17, the situation was a bit complicated:

  • if both `++` and `*` invoke a function (e.g., when `it` is of class type with overloaded operators), then the behavior was unspecified, just as it is since C++17;

  • otherwise, the behavior was undefined.

Per N4140 (C++14 draft) [expr.call]/8:

[ Note: The evaluations of the postfix expression and of the arguments are all unsequenced relative to one another. All side effects of argument evaluations are sequenced before the function is entered (see [intro.execution]). — end note ]

Thus, the code had undefined behavior because the evaluations of the two arguments were unsequenced relative to each other. The two evaluations could overlap, so the side effect of `it++` on `it` was unsequenced relative to the read of `it` in the second argument. Unless it is specified otherwise ...

Per N4140 [intro.execution]/15:

When calling a function (whether or not the function is inline), every value computation and side effect associated with any argument expression, or with the postfix expression designating the called function, is sequenced before execution of every expression or statement in the body of the called function. [ Note: Value computations and side effects associated with different argument expressions are unsequenced. — end note ] Every evaluation in the calling function (including other function calls) that is not otherwise specifically sequenced before or after the execution of the body of the called function is indeterminately sequenced with respect to the execution of the called function.9 Several contexts in C++ cause evaluation of a function call, even though no corresponding function call syntax appears in the translation unit. [ Example: Evaluation of a new expression invokes one or more allocation and constructor functions; see [expr.new]. For another example, invocation of a conversion function ([class.conv.fct]) can arise in contexts in which no function call syntax appears. — end example ] The sequencing constraints on the execution of the called function (as described above) are features of the function calls as evaluated, whatever the syntax of the expression that calls the function might be.

9) In other words, function executions do not interleave with each other.

Thus, if the operators are actually function calls, then the behavior is similarly unspecified.
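
The function-call case can be sketched with plain functions. The names `bump`, `peek`, `collect`, and `demo` are hypothetical and chosen only for illustration; `bump` plays the role of a class-type `operator++(int)` and `peek` the role of `operator*`. Because function bodies do not interleave, only two results are possible.

```cpp
#include <cassert>
#include <utility>

// Returns the old value of x, then increments it (like a post-increment).
int bump(int& x) { return x++; }

// Reads x without modifying it (like a dereference).
int peek(const int& x) { return x; }

// Packs its two arguments so the caller can inspect what was passed.
std::pair<int, int> collect(int first, int second) { return {first, second}; }

std::pair<int, int> demo() {
    int x = 0;
    // The two calls are indeterminately sequenced: either bump runs first,
    // giving {0, 1}, or peek runs first, giving {0, 0}. Never anything else.
    return collect(bump(x), peek(x));
}
```

Which of the two results appears is up to the implementation and may differ between calls, but the program has no undefined behavior.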

L. F.
  • I believe the "indeterminately sequenced with respect to that of any other parameter" is still a problem, as I cannot be certain that the post increment will happen *between* the dereferences. – Jonathan Mee Jul 12 '19 at 15:32
  • @JonathanMee “indeterminately sequenced” means either the evaluation of the first argument is sequenced before (read: completed before) the evaluation of the second argument (in which case the first argument is the original iterator and the second argument is the incremented iterator) or the other way around (in which case both arguments are the original iterator). – L. F. Jul 13 '19 at 00:02
  • You say "the first argument is the original iterator"; however, the iterator is dereferenced. The possible cases are that the arguments must be the first 2 locations the iterator "points" to, but they could be in either order – M.M Jul 14 '19 at 03:30
  • @M.M What if the second argument is evaluated first? – L. F. Jul 14 '19 at 03:35
  • Another detail: iterators are usually class types; so `++` translates to a function call and therefore, prior to C++17, the behaviour would have been unspecified. It would only be UB if the iterator was a raw pointer. – M.M Jul 14 '19 at 03:42
  • @M.M `*it` may access some internal state when `operator++` is halfway done, I suppose? – L. F. Jul 14 '19 at 03:46
  • Function bodies are indeterminately sequenced (or in pre-C++11 terminology, there's a sequence point on entry and exit of a function, or in layman's terms, the function bodies can't interleave). `foo( f(x), g(x) );` is perfectly fine, even if f and g both modify `x` by reference, or access a global variable for example. – M.M Jul 14 '19 at 03:47
  • @M.M I didn't know that before. Thank you for teaching me this! – L. F. Jul 14 '19 at 04:00
  • Just as `operator++` being a function prevents overlapping/interleaving of the operation, would it not also be sufficient for unary `operator*` to be a function (and then `operator++` doesn't need to be)? (Note, it actually is allowed for the compiler to interleave multiple function calls, under the as-if rule, as long as the outcome under sequential consistency rules is identical to some permitted non-overlapping ordering) – Ben Voigt Jul 15 '19 at 05:28
  • @BenVoigt If `operator++` is not a function, won't it be possible that `operator*` receives an iterator on which the increment is halfway done, I guess? – L. F. Jul 15 '19 at 05:29
  • @L.F. I suppose you refer to the `operator*` in the other parameter? Because the `operator*` in the first parameter is definitely sequenced after value computation of the `operator++`. But if that's undefined, then so is the case where only `operator++` is a function, because it could be invoked in the middle of the other operand's dereference. Dereference being interrupted by increment seems just as harmful as dereference on an incomplete increment. So I suppose you require both to be functions. – Ben Voigt Jul 15 '19 at 05:33
  • Good application of DeMorgan's Theorem in your edit, but I was under the impression we were leaning toward "unspecified" if BOTH are functions and "undefined" otherwise? – Ben Voigt Jul 15 '19 at 05:45