18

In the C and C++ languages, the arr[i] = i++; statement invokes undefined behavior. Why does the statement i = i + 1; not invoke undefined behavior?

Jayesh
  • 5
    Possible duplicate of [Undefined behavior and sequence points](https://stackoverflow.com/questions/4176328/undefined-behavior-and-sequence-points) – underscore_d Jun 03 '17 at 23:06
  • 5
    When asking questions like this, you need to focus on a *single language*, not "C and C++" any more than you'd ask about "C# and Java". – Cody Gray - on strike Jun 04 '17 at 04:43

5 Answers

36

Since this was originally tagged with c and c++ and not any specific version(s), the answer below is a generic answer to the problem. However, please note that for C++, from C++17 onwards, the behaviour has changed. Please see this answer by Barry to know more.


For the statement

arr[i] = i++;

the value of i is used in both operands, the RHS (right-hand side) and the LHS (left-hand side), and in one of them the value is also modified (as a side effect of the postfix ++), with no sequence point in between to determine which value of i should be used. You can also check this canonical answer for more on this.

On the other hand, for i = i + 1, the value of i is read only on the RHS and the computed result is stored to the LHS; in other words, there's no ambiguity. As a standalone statement, this has the same effect as i++, which

  • reads the value of i
  • increments it by 1
  • stores it back to i

in a well-defined sequence. Hence, no issues.
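As a full statement, i = i + 1, ++i and i++ all leave i with the same value. They differ only in the value the expression itself yields, which matters once that value is used. A minimal sketch of the difference follows; the helper names are hypothetical and exist only for illustration:

```cpp
// Each hypothetical helper modifies i through the reference and returns
// the value that the expression itself yields.

// Assignment yields the new value of i.
int value_of_assignment(int& i) { return i = i + 1; }

// Pre-increment also yields the new value of i.
int value_of_preincrement(int& i) { return ++i; }

// Post-increment yields the old value of i.
int value_of_postincrement(int& i) { return i++; }
```

Starting from i = 3, all three leave i at 4, but only the post-increment version returns 3.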

Barry
Sourav Ghosh
  • 4
    Nice answer to a non-duplicate variant of this corner of the language. – Bathsheba Jun 03 '17 at 07:18
  • 7
    It's been a few years since C++ abandoned "sequence points" in favor of a more convoluted description based on "sequenced after" that made, for example, `i=++i;` legal code. – 6502 Jun 03 '17 at 07:23
  • 5
    `i = i + 1` is not equivalent to `i++`. It is equivalent in effect to `++i`. Compare the effect of `j = i = i+1` with `j = ++i` and `j = i++` to see the distinction (remembering that assignment operators are right to left associative, so `j = i = i+1` is equivalent to `j = (i = i+1)`). – Peter Jun 03 '17 at 07:59
  • @Peter well, you may be right, but my answer was pointed to the exact statement mentioned, so I guess in that, it does not make any difference, does it? – Sourav Ghosh Jun 03 '17 at 08:28
  • 1
    @Bathsheba Thank you sir for the kind words. :) – Sourav Ghosh Jun 03 '17 at 08:29
  • 4
    @SouravGhosh - my problem is that you wrote your second para in a way that suggests `i = i+1` and `i++` are equivalent, when they are not. Making that suggestion is unnecessary to answering the question, and is also misleading. – Peter Jun 03 '17 at 12:50
  • Note that this changes in C++17, where `arr[i] = i++` is [actually well-defined](http://en.cppreference.com/w/cpp/language/eval_order) (it gets the current value of `i`, increments `i`, and then sets `arr[i]` to the old value of `i`. Thus, it is equivalent to `{auto tmp = i; i++; arr[i] = tmp;}`). – Daniel H Jun 05 '17 at 15:18
  • @DanielH Thanks for the reminder.. I guess the C++17 was not explicitly mentioned and I am no expert in C++ either, so from that point, my answer was incomplete. Now there's an answer from Barry which complements mine, and I'm just now adding a link of that to mine. Hope that will be helpful. :) – Sourav Ghosh Jun 05 '17 at 15:24
13

Note that this will change in C++17. In C++17, arr[i] = i++ does not invoke undefined behavior. This is due to the following change in [expr.ass]:

In all cases, the assignment is sequenced after the value computation of the right and left operands, and before the value computation of the assignment expression. The right operand is sequenced before the left operand.

That is, we do i++, then we do arr[i], then we perform the assignment. The now well-defined ordering is:

auto src = i++;
auto& dst = arr[i];
dst = src;
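
For anyone who wants to check the outcome with concrete numbers, here is a small sketch that spells the same ordering out as separate statements, so it does not depend on the C++17 rule at all. The function name and the use of std::array are my own assumptions for illustration:

```cpp
#include <array>
#include <cassert>
#include <cstddef>

// Hypothetical helper: performs the steps of arr[i] = i++ in the order
// C++17 prescribes (right operand first), one statement per step.
template <std::size_t N>
void assign_right_then_left(std::array<int, N>& arr, int& i) {
    const int src = i;  // value computation of i++: the old value
    i = i + 1;          // side effect of i++
    assert(i >= 0 && static_cast<std::size_t>(i) < N);  // index must be in range
    int& dst = arr[static_cast<std::size_t>(i)];  // left operand sees the new i
    dst = src;          // the assignment itself
}
```

With i = 3 and a zero-initialized array of 8 elements, this leaves i == 4 and arr[4] == 3, while arr[3] is untouched.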
Barry
  • 2
    Even though this now works, it is still a horrendously brittle expression. Fortunately, if you are using GCC, you can keep this garbage expression undefined by using `-fno-strong-eval-order`. – KevinZ Jun 04 '17 at 00:04
  • 4
    @KevinZ why fortunately? What's the benefit of having extra undefined behavior in this case? – Ruslan Jun 04 '17 at 08:21
  • 2
    Yeah, it's bad to write such an expression, purely on account of how obfuscated and faux-clever it looks - but it is worse to leave a timebomb in the code and force your compiler to avoid complying with the new Standard. Writing good code should be encouraged by reasoning, not discouraged by UB... _you hope, if the UB ever detectably manifests before it's too late_. In fact, frankly, advising people to rely on something being UB, and observably so, is incredibly stupid. People who would write this code need to be explained out of it; they're highly unlikely to be running with UBsan or whatever – underscore_d Jun 04 '17 at 11:24
  • @Ruslan So that GCC will not unnecessarily pessimize ILP on my perfectly sane code. More fundamentally, the act of storing to a memory location twice is a stupid idea in its very essence. By defining its semantics, C++17 has essentially encouraged it. It's much easier to deduce that a double store is stupid and illegal than to remember any sequence of broken semantics regarding its misuse. – KevinZ Jun 04 '17 at 14:26
  • @KevinZ How is `arr[i] = i++` a double store, unless there are additionally aliasing issues which haven’t been mentioned? – Daniel H Jun 05 '17 at 15:22
  • @barry sir, instead of copying the content, I've just mentioned a link to your answer with due credits (_I believe, please let me know if any modification needed_) in mine, hope that's okay with you. Cheers!! – Sourav Ghosh Jun 05 '17 at 15:34
  • @DanielH You are right; I meant a store+read that does not have a clear ordering. – KevinZ Jun 05 '17 at 23:22
9

For C99, we have:

6.5 Expressions

  2. Between the previous and next sequence point an object shall have its stored value modified at most once by the evaluation of an expression. Furthermore, the prior value shall be read only to determine the value to be stored.

In arr[i] = i++, the value of i is only modified once. But arr[i] also reads from i, and this value is not used to determine the new value of i. That's why it has undefined behavior.

On the other hand, in i = i + 1 we read i in order to compute i + 1, which is used as the new value of i. Therefore this expression is fine.
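
If the intent behind arr[i] = i++ is "store the index, then advance", splitting it into two statements satisfies both halves of the rule. A minimal sketch, assuming a hypothetical helper that fills a std::vector so that element k holds k:

```cpp
#include <cstddef>
#include <vector>

// Hypothetical helper: returns a vector of n elements where arr[k] == k.
std::vector<int> indices(std::size_t n) {
    std::vector<int> arr(n);
    int i = 0;
    while (static_cast<std::size_t>(i) < n) {
        // This statement only reads i; the object modified is the element.
        arr[static_cast<std::size_t>(i)] = i;
        // i is modified once, and its prior value is read only to
        // determine the value being stored.
        i = i + 1;
    }
    return arr;
}
```

Each statement ends at a sequence point, so the read in the subscript and the update of i never fall between the same pair of sequence points.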

Community
melpomene
8
arr[i] = i++;

implies that

  • the right-hand expression is evaluated before the assignment
  • the subscript operator is evaluated before the assignment

but it is ambiguous about the order in which the right-hand expression and the subscript operator are evaluated. The compiler is free to treat it as

auto & val{arr[i]};
auto const rval{i}; // i++ yields the old value of i
i++;
val = rval;

or as

auto const rval{i};
i++;
auto & val{arr[i]};
val = rval;

or as (same result as the first version)

auto const rval{i};
auto & val{arr[i]};
i++;
val = rval;

which may produce an unpredictable result, while

i = i + 1;

does not have any ambiguity; the right-hand expression is evaluated before the assignment:

auto const rval{i + 1};
auto & val{i};
val = rval;

or (same result as above)

auto & val{i};
auto const rval{i + 1};
val = rval;
user7860670
2

In your example a[i] = i++, if i = 3 for example, do you think a[i] is evaluated first, or i++? In one case, the value 3 would be stored in a[3]; in the other case, it would be stored in a[4]. It's obvious that we have a problem here. No sane person would dare to write that code unless they had found a guarantee of exactly what will happen here. (Java gives that guarantee.)

What do you think could be a problem with i = i + 1? The language must read i first to calculate i + 1, then store that result. There is nothing here that could go wrong. The same holds for a[i] = i + 1. Evaluating i + 1, unlike i++, doesn't change i. So if i = 3, the number 4 must be stored in a[3].

Various languages have various rules to fix the problem with a[i] = i++. Java defines what happens: expressions are evaluated left to right, including their side effects. C defines it as undefined behaviour. C++ also made it undefined behaviour before C++17; since C++17 the right operand of an assignment is sequenced before the left one, so i++ is evaluated first and a[i] then uses the incremented i. So the same line of code can store 3 in a[3] (Java), store 3 in a[4] (C++17), or do anything at all (C and older C++). Obviously that's too many possibilities to be acceptable in your code.
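
The left-to-right order that Java guarantees can be written out in C++ as separate statements, which makes the result independent of any evaluation-order rule. This is only a sketch; the function name and the use of std::array are assumptions made for the example:

```cpp
#include <array>
#include <cassert>
#include <cstddef>

// Hypothetical helper: performs the steps of a[i] = i++ strictly left to
// right, the way Java evaluates the expression.
template <std::size_t N>
void assign_left_then_right(std::array<int, N>& a, int& i) {
    assert(i >= 0 && static_cast<std::size_t>(i) < N);  // index must be in range
    int& dst = a[static_cast<std::size_t>(i)];  // left operand first: old i
    const int src = i;  // value of i++ is the old value of i
    i = i + 1;          // side effect of i++
    dst = src;          // store the old value at the old index
}
```

With i = 3 this stores 3 in a[3] and leaves i == 4, which is the first of the two outcomes described above.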

gnasher729