While reading an interesting article in ACCU's Overload #115, "Demons May Fly Out Of Your Nose", I found the author saying:

between sequence points you are not allowed to make any assumptions about the state of involved variables. This also means that in C, unlike most other languages, the following expression leads to undefined behaviour

v[i] = i++;

because the assignment operator does not represent a sequence point in C

Can someone explain the detailed reasoning that implies UB here? I thought it would be a matter of having more than one write to the same variable between two sequence points, which I cannot see here except for the possibility of v[i] aliasing i...

abigagli
  • This may help you. http://stackoverflow.com/questions/13744507/why-is-i-vi-undefined – Shashwat Kumar Jul 03 '13 at 21:33
  • possible duplicate of [Could anyone explain these undefined behaviors (i = i++ + ++i , i = i++, etc...)](http://stackoverflow.com/questions/949433/could-anyone-explain-these-undefined-behaviors-i-i-i-i-i-etc) – Pascal Cuoq Jul 03 '13 at 21:36
  • The paragraph explains why, "between sequence points you are not allowed to make any assumptions about the state of involved variables", and `v[i]` assumes `i` will have some particular meaningful value. – David Schwartz Jul 03 '13 at 21:42
  • "I thought it would be a matter of having more than one write" -- Why would you think that? It clearly is a matter of writing *and* reading, with the value read differing depending on the order. – Jim Balter Jul 03 '13 at 21:43
  • @David Schwartz: "Between sequence points you are not allowed to make any assumptions about the state of involved variables" should probably apply only to the variables that are *modified* between these sequence points. We are allowed to make assumptions about the state of non-modified variables. Otherwise, nothing will ever work. – AnT stands with Russia Jul 04 '13 at 18:04

3 Answers

It is not just about writes. Reads also play a role in situations when you don't know whether you are reading the old value of the variable or the new value. In this case there's no way to say whether v[i] refers to the old value of i or to the new value of i.

For example, the expression v[i] = i++ can be interpreted as

  1. Perform assignment: v[i] = i
  2. Increment i

Or, alternatively, it can be interpreted as

  1. Get the old value of i: i_old = i
  2. Increment i
  3. Perform assignment: v[i] = i_old

As you can easily see, the behavior of the code changes depending on how it is interpreted. These are just examples of two possible inconsistent scenarios.

But the language does not restrict the behavior to such a limited variety of scenarios. Instead, the language says that the behavior is undefined, which means, among other things, that the compiler is free to refuse to interpret this code in any "predictable" way.
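
As a rough sketch, the two readings listed above can be written out as separate, well-defined C functions. The function names and the choice to pass i by value and return its new value are assumptions made purely for illustration. Neither function is "what the compiler does" for v[i] = i++; they only show that the two readings store into different elements.

```c
/* Reading 1: assign first, then increment.
   Both the index and the stored value use the old i. */
static int assign_then_increment(int *v, int i)
{
    v[i] = i;
    i++;
    return i;   /* the new value of i */
}

/* Reading 2: remember the old i, increment, then assign.
   The index uses the new i, the stored value is the old i. */
static int increment_then_assign(int *v, int i)
{
    int i_old = i;
    i++;
    v[i] = i_old;
    return i;   /* the new value of i */
}
```

Starting from i equal to 1, the first function writes 1 into v[1], while the second writes 1 into v[2]. Both leave i equal to 2.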

AnT stands with Russia
  • The example is really useful and I'd love to be able to mark more than one as the accepted answer as this is as good as Praetorian's, but I'm choosing his because the explicit reference to the additional requirement "that the variable's value be only accessed to determine the value to be modified" is what gave me the specific reasoning I was looking for. Thanks a lot – abigagli Jul 04 '13 at 09:48
  • @abigagli: No problem, just keep in mind that proper wording is actually "only accessed to determine the value to be *stored*". Also, see my comment to Praetorian's answer for some extra details. – AnT stands with Russia Jul 04 '13 at 18:00

There is a read and a write of i with no intervening sequence point. That is, i++ could mutate i before OR after the address of v[i] is computed. In fact, because this behaviour is undefined, it could very well do neither of those things: the compiler could, for example, assume the statement to be unreachable and remove it. (Or make demons fly out of your nose, etc.)

Tavian Barnes

You're correct that modifying a variable more than once without an intervening sequence point leads to undefined behavior. But there is an additional requirement that the variable's value be only accessed to determine the value to be modified. It's the second requirement that disallows

v[i] = i++;

Here the access of i's value for array indexing has nothing to do with the modification being made to i (i++). Since there is no intervening sequence point between the two accesses, this is undefined behavior.
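
For contrast, here is a minimal sketch of two forms that do satisfy the rule. The function names are hypothetical and chosen only for this example. In the first, the only read of i is the one that determines the value written back to i. In the second, the comma operator introduces a sequence point between the indexing read of i and the increment.

```c
/* Well defined: i is read only to compute the value stored back into i. */
static int bump(int i)
{
    i = i + 1;
    return i;
}

/* Well defined: the comma operator is a sequence point, so the read of i
   used for indexing is complete before i++ modifies i. */
static int store_then_bump(int *v, int i)
{
    v[i] = i, i++;
    return i;
}
```

Splitting the second form into two statements, v[i] = i; followed by i++;, has the same effect, since the end of a full expression is also a sequence point.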

Praetorian
  • Actually, the formal requirement is "the prior value shall be read only to determine the value to be *stored*". *Stored*, not *modified*. And this requirement is, unfortunately, subject to interpretation in some more tricky cases. For example, it is not immediately clear whether expressions like `a[a[i]] = i` are supposed to be defined or undefined if the initial value of array `a` is `{0, 1, 2, 3, 4 ... }`. This is one of the reasons the sequencing rules were completely redesigned in the latest C and C++ specs (C11, C++11). – AnT stands with Russia Jul 04 '13 at 17:59
  • @AndreyT Looking specifically at the C++11 standard, I think the most relevant clause for my case is 1.9/15. In particular "If a side effect on a scalar object is unsequenced relative to [...] a value computation using the value of the same scalar object, the behavior is undefined". I'm interpreting it with "i++" as the "side effect on a scalar object" and "v[i]" as the "value computation using the value of the same scalar object". – abigagli Jul 04 '13 at 21:11