
I am trying to learn how to explain the cause (if any) of undefined behavior in the cases given below.

int i = 0, *ptr = &i;
i = ++i; // is this UB? If yes, then why, according to C++11?
*ptr = (*ptr)++; // I think this is UB, but I am unable to explain exactly why
*ptr = ++(*ptr); // I think this is not UB, but I can't explain why

I have looked at many SO posts describing UB for pointer cases similar to the ones above, but I am still unable to explain exactly why they result in UB, that is, which point(s) of the standard can be used to prove it.

I am looking for explanations according to C++11 (or C++14), not C++17 and not pre-C++11.

Jason
  • Which posts have you looked at? What have you found in the according standards you reference? – Ulrich Eckhardt Dec 29 '21 at 15:22
  • @UlrichEckhardt Like [this](https://stackoverflow.com/questions/4176328/undefined-behavior-and-sequence-points), for example. There are others too, of course. But I intentionally did not mention each individual post in my question because I want a separate answer and want to keep my question short and to the point. – Jason Dec 29 '21 at 15:25
  • Great find! It even has a specific and detailed answer for C++11. I wonder which parts of the explanation are missing for you... – Ulrich Eckhardt Dec 29 '21 at 15:30
  • See [this](https://stackoverflow.com/questions/4176328/undefined-behavior-and-sequence-points) and [this](https://stackoverflow.com/questions/38501587/what-are-the-evaluation-order-guarantees-introduced-by-c17) – NathanOliver Dec 29 '21 at 15:35
  • @NathanOliver Yes i have already read those posts you linked. Still i want a separate answer. Thanks anyways. – Jason Dec 29 '21 at 15:40
  • if i'm not mistaken, `i = ++i;` is not UB since C++11. `*ptr = ++(*ptr);` is not UB too. `*ptr = (*ptr)++;` should be UB in C++11 but not C++17 according to https://en.cppreference.com/w/cpp/language/eval_order – Afshin Dec 29 '21 at 15:42
  • Why is it so important to write unreadable code? Even if it is nice to know... is it good practice to write such stuff? – Klaus Dec 29 '21 at 17:28
  • @Klaus It is not about readability. I want to know what is happening so that i could avoid writing code like this in the future. I never said that you(someone) must write code like this. I have asked this question out of curiosity more than anything else. If you can't appreciate that then there's nothing i could do to convince you of the same. PS: There is no readability issue with my code. – Jason Dec 29 '21 at 18:26
  • "There is no readability issue with my code" If I am the reader, I have no idea what happens or what should happen. So why someone should write such code. There is no need to do such things. In our code reviews we did not accept such things and if we get such code more often, we take a note on it for next HR review... sorry, but this kind of bug introduction is harmful! Yes, it is nice to know what happens... but it is fully academic. Only my two cents... – Klaus Dec 29 '21 at 18:48
  • @Klaus If you're saying that there is no need to write code like this, then i agree. But this doesn't mean that we should not know about what is going on behind the scenes(in this scenario). Yes this is *for extending my C++ knowledge*. Yes this is for academic purposes. Also, as you said people might accidentally write this type of code without knowing that there is UB or any kind of problem with their code. It is only when we ask such(these types of) questions that we can really understand the subject. – Jason Dec 29 '21 at 18:51
  • "that we should not know about what is going on behind the scenes" That's my problem! There are so many corner cases, and they even depend on the selected C++ version. So the only thing I remember for all such expressions: It's dangerous! :-) So we simply remove them all, immediately, without any further discussion, as long as we have an idea what the line should do... – Klaus Dec 29 '21 at 18:56
  • @Klaus You're completely missing the point. If we don't know what is happening behind the scenes, then how will we know whether or not something is dangerous? IMO it is better to deal with the fundamental problems first. What you're suggesting is to "memorize" what happens in these cases, while what I am suggesting is to know the cause/reason for something, irrespective of whether or not it is dangerous. – Jason Dec 29 '21 at 19:06
  • While it is easy to get UB by writing an equation with a bunch of `i++` and/or `++i`. I don't see how any of these can be UB. Say `++i = i;` is clearly UB. – ALX23z Dec 29 '21 at 21:16
  • @JasonLiam tools – Taekahn Dec 30 '21 at 01:12
  • @ALX23z I don't believe `++i = i` is in fact UB in C++17. C++17 specifies that the evaluation of `i` (the right-hand side) happens first; then that of `++i`, including the side effect; and finally the assignment. In the end, the expression is well defined to be a no-op, equivalent to `i=i` – Igor Tandetnik Dec 30 '21 at 23:54
  • @IgorTandetnik well, the discussion is about C++11/C++14 as per OP request. – ALX23z Dec 31 '21 at 08:19

2 Answers


Undefined behavior stems from this:

C++11 [intro.execution]/15 Except where noted, evaluations of operands of individual operators and of subexpressions of individual expressions are unsequenced... If a side effect on a scalar object is unsequenced relative to either another side effect on the same scalar object or a value computation using the value of the same scalar object, the behavior is undefined.

C++17 [intro.execution]/17 Except where noted, evaluations of operands of individual operators and of subexpressions of individual expressions are unsequenced... If a side effect on a memory location (4.4) is unsequenced relative to either another side effect on the same memory location or a value computation using the value of any object in the same memory location, and they are not potentially concurrent (4.7), the behavior is undefined.

The two texts are similar. The main difference lies in the "except where noted" part: in C++17, the order of evaluation of operands is specified for more operators than in C++11. Thus:

C++17 [expr.ass]/1 In all cases, the assignment is sequenced after the value computation of the right and left operands, and before the value computation of the assignment expression. The right operand is sequenced before the left operand.

C++11 lacks the last sentence ("The right operand is sequenced before the left operand"). That sentence is what makes `i = i++` well-defined in C++17 but undefined in C++11. That's because for postfix increment, the side effect is not part of the value computation of the expression:

C++11 and C++17 [expr.post.incr]/1 The value computation of the ++ expression is sequenced before the modification of the operand object.

So "the assignment is sequenced after the value computation of the right and left operands" is not by itself sufficient: the assignment is sequenced after the value computation of i++, and the side effect is also sequenced after that same value computation, but nothing says how they are sequenced relative to each other. Therefore, they are unsequenced, and they are both modifying the same object (here, i). This exhibits undefined behavior.

The addition of "the right operand is sequenced before the left operand" in C++17 means that the side effect of i++ is sequenced before the value computation of i, and both are sequenced before the assignment.
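To make that ordering concrete, here is a minimal sketch that performs the three steps of `i = i++` as separate statements, in the order C++17 prescribes. The function names are hypothetical and exist only for illustration; because every step is its own full expression, nothing here is unsequenced under any standard version.

```cpp
#include <cassert>

// Hypothetical helper: the steps of `i = i++`, one statement at a time,
// in the order C++17 requires (right operand fully evaluated first).
int assign_post_increment_in_cpp17_order(int i) {
    const int old_value = i;  // value computation of i++ yields the old value
    i = old_value + 1;        // side effect of i++ completes next
    i = old_value;            // the assignment then stores the right operand's value
    return i;
}

// Small self-check: the increment is overwritten, so the value is unchanged.
void check_assign_post_increment() {
    assert(assign_post_increment_in_cpp17_order(0) == 0);
    assert(assign_post_increment_in_cpp17_order(41) == 41);
}
```

In C++11 the last two statements have no defined relative order inside the single expression `i = i++`, which is exactly the unsequenced pair of side effects described above.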


On the other hand, for pre-increment the side effect is necessarily part of the evaluation of the expression:

C++11 and C++17 [expr.pre.incr]/1 ... The result is the updated operand; it is an lvalue ...

So the value computation of ++i involves incrementing i first, and then applying an lvalue-to-rvalue conversion to obtain the updated value. This value computation is sequenced before the assignment in both C++11 and C++17, and so the two side effects on i are sequenced relative to each other; no undefined behavior.


Nothing changes in this analysis if i is replaced with (*ptr). That's just another way to refer to the same object or memory location.
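As a rough illustration of the pre-increment case, the sketch below uses the two expressions from the question that this analysis treats as well-defined, next to a spelled-out version of the same steps. The function names are hypothetical, and some compilers may still emit a sequence-point warning for the direct forms even though, by the reasoning above, they are defined in C++11.

```cpp
#include <cassert>

// Direct form: by the analysis above, well-defined in C++11 and later.
int pre_increment_assign(int i) {
    i = ++i;
    return i;
}

// The same expression through a pointer to the same object.
int pre_increment_assign_via_pointer(int i) {
    int* ptr = &i;
    *ptr = ++(*ptr);
    return i;
}

// Hypothetical spelled-out equivalent of the steps described above.
int pre_increment_assign_in_steps(int i) {
    i = i + 1;            // the side effect of ++i is part of its evaluation
    const int value = i;  // lvalue-to-rvalue conversion reads the updated value
    i = value;            // the assignment stores that same value back
    return i;
}

// Small self-check: all three forms agree.
void check_pre_increment_assign() {
    assert(pre_increment_assign(0) == pre_increment_assign_in_steps(0));
    assert(pre_increment_assign_via_pointer(0) == pre_increment_assign_in_steps(0));
}
```

The postfix forms `i = i++` and `*ptr = (*ptr)++` are deliberately left out of the code, since in C++11 they have undefined behavior and no result could be asserted for them.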

Igor Tandetnik
  • Wow, good analysis. I am making sure that i get everything correctly. Meanwhile +1 from me. – Jason Dec 31 '21 at 04:23

The C++ Standard is based upon the C Standard, whose authors didn't need any particular "reason" to say that implementations may process a construct in whatever fashion would be most useful to their customers [which is what they intended the phrase "Undefined Behavior" to mean]. Many platforms can cheaply guarantee, for small primitive types, that race conditions involving a read and a conflicting write to the same object will always yield either old or new data, and that race conditions involving conflicting writes will result in every subsequent read seeing one of the written values. Rather than trying to identify all of the cases where implementations should or should not be expected to uphold such guarantees, the Standard allows implementations to, at their leisure, process code "in a documented manner characteristic of the environment". Because it's not practical for all implementations to offer such guarantees in all possible scenarios, and because the range of scenarios where such guarantees would be practical differs from platform to platform, the authors of the Standard allowed implementations to weigh the pros and cons of offering various behavioral guarantees on their particular target platforms, rather than trying to write precise rules that would be appropriate for all possible implementations.

Note also that if one were to do something like:

*p = (*q)++;
return q[0] + q[i]; // where 'i' is some object of type `int`.

when p and q are equal and i is zero, a compiler might quite plausibly generate code where the assignment would undo the effect of the increment, but which would return the sum of the old value of *q, plus 1, plus the actual stored value of *q (which would be the old value, rather than the incremented value). Although this would be a logical consequence of the specified race-condition semantics, trying to specify it precisely would have been sufficiently awkward that the Standard simply allows implementations to specify the behavior as tightly or loosely as they see fit.
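For comparison, here is a sketch of how the same snippet could be written so that the aliasing case has one defined outcome. It assumes the intended order is "increment first, then assign" (the order C++17 later adopted), which is an assumption and not something the snippet above specifies. The function name `store_then_sum` is hypothetical.

```cpp
#include <cassert>

// Hypothetical rewrite: the increment and the assignment are separate
// statements, so they are sequenced even when p == q.
int store_then_sum(int* p, int* q, int i) {
    assert(p != nullptr && q != nullptr);  // precondition for this sketch
    const int old_value = (*q)++;  // the increment of *q completes here
    *p = old_value;                // if p == q, this undoes the increment
    return q[0] + q[i];            // reads the values actually stored
}
```

With `p == q` and `i == 0`, this version returns twice the original value of `*q`, because both reads see the stored (restored) value instead of a mix of a cached incremented value and the stored one.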

supercat