5

I understand that C uses the notion of sequence points to identify ambiguous computations, and that the = operator is not a sequence point. However, I am unable to see any ambiguity in executing the statement

i = ++i

As I understand it, this simply amounts to evaluating whatever is at &i, incrementing it, and storing it back at the same location. Yet GCC flags it as follows:

[Warning] operation on 'i' may be undefined [-Wsequence-point]

Am I missing something about how = works?

EDIT : Before marking as duplicate, please note that I have browsed other posts about sequence points and undefined behavior. None of them addresses the expression i=++i (note the pre-increment) specifically. Expressions mentioned are generally i=i++, a=b++ + ++b, etc. And I have no doubts regarding any of them.

Qurious
  • I think you're missing something about how the side effect of `++` functions. – Fred Larson Nov 12 '14 at 19:03
  • Since it's pre-increment, the value of 'i' used should be the one after incrementing. – Qurious Nov 12 '14 at 19:07
  • 1
    The result returned will be the incremented value of `i`. But `i` may get incremented after the assignment. The standard specifies that side effect will occur before the next sequence point, but that may be before or after the assignment. – Fred Larson Nov 12 '14 at 19:12
  • 1
    Your confusion is probably because you are not aware of sequence points. See [here](http://www.parashift.com/c++-faq/sequence-points.html) and see this question: http://stackoverflow.com/questions/4176328/undefined-behavior-and-sequence-points – ldog Nov 12 '14 at 19:15
  • @FredLarson, if `i` may get incremented after assignment, something similar might happen in `a=++b`, and `a` might get the value of `b` _before_ the increment. But that doesn't happen, even though `=` is not a sequence point in either case. Why? – Qurious Nov 12 '14 at 19:27
  • @Idog, I am aware of sequence points. I have stated that specifically at the beginning of my question. – Qurious Nov 12 '14 at 19:28
  • @chrk, I have checked that post, and many others. I know that `i=i++` or `i=i++ + ++i` are undefined and can pin-point the reason of their ambiguity. I simply cannot understand what is ambiguous in `i=++i`. Notice the _pre_-increment. – Qurious Nov 12 '14 at 19:33
  • 2
    No, you misunderstand. I'm not saying the value returned by `++i` could be different. I'm saying it will always return `i+1`. But then the increment side effect could occur either before or after the assignment. Where the side effect occurs in `a = ++b;` makes no difference because there is no self-assignment. – Fred Larson Nov 12 '14 at 19:40
  • 1
    @FredLarson So you're saying that _returning_ the value of `i+1` and _modifying_ what's at `&i` are independent events? So returning the computed value is the actual operation, and incrementing the value at the memory location of `i` is the side effect of using the `++` operator? – Qurious Nov 12 '14 at 19:52
  • Yes, that's it exactly. – Fred Larson Nov 12 '14 at 19:54
  • Makes perfect sense now. Thanks !!. I always thought of increment as an atomic operation of _modify_ _variable_ _and_ _return_ _value_. So the confusion was mainly about 'side effects'. – Qurious Nov 12 '14 at 19:58

2 Answers

7

You are missing something about undefined behavior. Undefined behavior simply means the standard places no requirements on what the compiler does. It can throw an error, it can (as GCC does) show a warning, it can cause demons to fly out of your nose. The primary thing is, you cannot rely on it behaving sensibly or consistently between compilers, so don't do it!

In this case, the compiler does NOT have to guarantee that the side effect of the right-hand side (the increment of i) is completed before the assignment stores its result. This may seem odd, but the compiler is not required to evaluate things the way you would by hand. It could, if it wants, calculate the value of ++i in a register, assign it to i, and then perform the increment on the actual variable. So it would look more like

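reg = i + 1;   /* value of ++i, held in a register (reg is a placeholder name) */
i = reg;       /* the assignment */
i = i + 1;     /* the increment's side effect, applied late */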

The standard gives you no guarantee that this doesn't happen, so just don't do it!

Federico klez Culloca
IdeaHat
  • 3
    The question is not what is UB, the question is *why* is it UB? – didierc Nov 12 '14 at 19:09
  • @didierc Edited to answer that question. – IdeaHat Nov 12 '14 at 19:10
  • @IdeaHat The increment will be done before its value will be assigned to lhs. – Vlad from Moscow Nov 12 '14 at 19:24
  • 1
    good answer, however it seems to me that side effect strictly means updating a memory cell. Doing the increment at the same time is performing computation as well (IMHO). I think that confusion comes from that precise point. – didierc Nov 12 '14 at 19:25
  • @IdeaHat Well yes, it could store intermediate results in registers, but why add 1 to `i` twice? It should add only once, either before or after assignment. And since it's _pre_-increment, `i` should be incremented before its value is fetched for any use in that expression. – Qurious Nov 12 '14 at 19:38
  • 1
    Yes: the side effect is "updating a memory cell". But the method for "updating a memory cell" is not dictated by the standard; anything that has the same effect is allowed (most implementations will need to fetch the value into a register, increment the value in the register, and write it back to "the cell"). And the moment for "writing it back" is not fixed; the only requirement is that it be completed before the next sequence point (the `;`), and that is the reason why modifying a value more than once between sequence points is forbidden (or leads to UB). – wildplasser Nov 12 '14 at 19:55
  • @Qurious Because computers are weird. Its not that it *will* it is just that the standard doesn't say anything that says it *can't*, which is a very important distinction. – IdeaHat Nov 12 '14 at 19:56
  • Got it now. Confusion was arising from what the _side_ _effect_ part of the `++` operation was. I kept thinking increment meant to 'update memory location then return value'. That _updation_ was a side effect was not clear to me. Thanks for all your efforts ! – Qurious Nov 12 '14 at 20:02
  • @didierc A side effect doesn't strictly mean that at all. It means that the semantic value of i changes after this point. But as long as the semantic results are the same, the compiler is free to do whatever it wants. The fact of the matter is that the instruction to increment `i` might not appear until just before the next time it is used. In fact, if you do `i++;` on its own and `i` is guaranteed never to be used again (it goes out of scope), a good compiler will probably just remove that instruction altogether, side effect or no! – IdeaHat Nov 12 '14 at 20:05
3

The undefined behavior arises because the variable i is modified more than once between two sequence points. A sequence point is a point in execution at which all side effects of previous evaluations are complete and no side effects of subsequent evaluations have taken place. The C99 standard (section 6.5, paragraph 2) states:

Between the previous and next sequence point an object shall have its stored value modified at most once by the evaluation of an expression. Furthermore, the prior value shall be read only to determine the value to be stored.

So, what are the side effects that we are concerned about?

  • ++i, which assigns to i the value i+1
  • i = ++i, which assigns to i the value of the expression ++i, which is i+1

So, we are going to get two (admittedly, equivalent) side effects: assigning i+1 to the variable i. What we're concerned about is, between which two sequence points do these side effects occur?

What operations constitute sequence points? There are multiple, but there is only one that is actually relevant here:

  • at the end of a full expression (in this case, i = ++i is a full expression)

Notably, the pre-increment ++i is not a sequence point. This means that both side effects (the increment and the assignment) occur between the same two sequence points and modify the same variable i. Thus, it is undefined behavior; the fact that both modifications happen to store the same value is inconsequential.


But why is it bad to modify a variable multiple times between sequence points? To prevent things like:

i = ++i + 1;

Here, i is incremented, but it is also assigned the value (i+1) + 1, due to the semantics of the pre-increment. Since the two side effects are not ordered relative to each other, the final value of i would depend on which store happens last, and the behavior is undefined.

Now, the standard could hypothetically make a special case allowing multiple modifications between two sequence points as long as the stored values are the same, but this would likely complicate compiler implementations needlessly, without much benefit.

voithos