7

Possible Duplicate:
Unsequenced value computations (a.k.a sequence points)
Undefined Behavior and Sequence Points
Operator Precedence vs Order of Evaluation

I'm still trying to wrap my head around how the following expression results in undefined behavior:

a = a++;

Upon searching SO about this, I found the following question:

Difference between sequence points and operator precedence? 0_o

I read through all the answers but I still am having difficulty with the details. One of the answers describes the behavior of my above code example as ambiguous, in terms of how a is modified. For example, it could come down to either of these:

a=(a+1);a++;
a++;a=a;

What exactly makes a's modification ambiguous? Does this have to do with CPU instructions on different platforms, and how the optimizer can take advantage of the undefined behavior? In other words, it seems undefined because of the generated assembler?

I don't see a reason for the compiler to use a=(a+1);a++;, it just looks quirky and doesn't make much sense. What would possess the compiler to make it behave this way?

EDIT:

Just to be clear, I do understand what is happening; I just don't understand how it can be undefined when there are rules on operator precedence (which essentially define the order of evaluation of the expression). Assignment happens last in this case, so a++ needs to be evaluated first to determine the value to assign to a. So what I expect is that a is modified first, during the postfix increment, which then yields a value to assign back to a (the second modification). The rules for operator precedence seem to make the behavior very clear to me, and I fail to find where there is any "wiggle room" for undefined behavior.

void.pointer
  • 3
    You are requesting to modify `a` *twice* without sequence point: Once in the assignment, and once as a side effect of the `++`. The standard doesn't specify what you mean. – Kerrek SB Sep 07 '12 at 15:37
  • 1
    @KerrekSB: I already knew `a` is being modified twice, what I'm asking is what exactly is undefined about it. "The standard doesn't specify what you mean" -- What? It specifies order of operations & operator precedence, it makes it very clear what the expression means. I can look at it and know what happens, in order. – void.pointer Sep 07 '12 at 15:41
  • 2
    Operator precedence merely disambiguates between several possible ways of parsing an expression (think of `*p++`, which means `*(p++)`); it doesn't help determine the order in which modifications of an object take place. – jrok Sep 07 '12 at 15:48
  • 1
    YES guys, there are duplicates, but what good do the duplicates do for me if the answers there make no sense to me? My goal is to get different answers, to better my understanding. – void.pointer Sep 07 '12 at 15:51
  • @Robert - How is this important? You try to change the value of `a` twice in the same expression. Why would you want to do that? There is absolutely no practical use for this. – Bo Persson Sep 07 '12 at 15:52
  • @BoPersson I never said I use this in practice. This is an educational session, to better my understanding of the semantics of C++ as defined by the standard. – void.pointer Sep 07 '12 at 15:53
  • @Robert [This answer](http://stackoverflow.com/a/4177063/947836) helped me a great lot to understand this. – jrok Sep 07 '12 at 15:55
  • @Robert - But there is nothing to understand. :-) In the real world, there is no need to update `a` twice, so why do we care? `++a` will work fine, `a = a++ + ++a - a--` will not. Use the one that works! – Bo Persson Sep 07 '12 at 15:58
  • @Robert: "My goal is to get different answers" Then you use a *bounty*, saying that you don't understand the current answers on that question. – Nicol Bolas Sep 07 '12 at 15:59
  • 1
    I think you may be trying to over-analyse the situation. You have code that, by the rules of the language, has _undefined behavior_. At this point you can't apply any logic such as "if it didn't have _undefined behavior_, it could only have one meaning, so it shouldn't be undefined". It is what it is. – CB Bailey Sep 07 '12 at 16:07
  • @CharlesBailey One of the greatest philosophies about science: Always question things. There is always a reason why something is undefined. When the committee decided to not make rules that would otherwise make it well defined, I want to know what scenarios were playing through their mind. – void.pointer Sep 07 '12 at 16:11
  • But a programming standard is not a natural phenomenon to which you can apply scientific method and reasoning. It is a set of rules devised by humans. – CB Bailey Sep 07 '12 at 16:13
  • @CharlesBailey And government and law are the same thing, but just because there are rules in place doesn't make them right. Just like I'd try to understand why something has a rule, I want to understand why something doesn't have a rule. You are misunderstanding me, you think I am trying to change the standard or wish it was different. No. I'm trying to understand why it is undefined. Please be clear on this. – void.pointer Sep 07 '12 at 16:14
  • OK, but when I replied to your comment it merely read: "@CharlesBailey One of the greatest philosophies about science: Always question things.". I hope my reply makes more sense in this context. – CB Bailey Sep 07 '12 at 16:20
  • @CharlesBailey Programming is still a science, and science doesn't always mean something has to be a natural phenomenon. However C++ is something you certainly can apply scientific method and reasoning, and the standard is proof of this (it is a direct result of such). Also the standard isn't a fixed entity, it evolves (example: C++11) due to scientific questioning (e.g. "How can we make this better?") -- Getting off topic now, but that's what I meant. – void.pointer Sep 07 '12 at 16:23
  • @RobertDailey: If you look at my reply to the question I nominated as a duplicate, it gives examples that are quite specific and (IMO) pretty reasonable for how you'd end up with problematic behavior from the kind of code you've discussed. If you've already looked at that and find my reasoning hard to follow (or far-fetched), I'd appreciate hearing about it. I have no problem at all with doing some editing if it'll clarify the answer. – Jerry Coffin Sep 07 '12 at 17:12

6 Answers

8

The first answer in the question you linked to explains exactly what's going on. I'll try to rephrase it to make it more clear.

Operator precedence defines how an expression is grouped, and therefore how its value is computed from the values of its subexpressions. The value of the expression (a++) is well understood.

However, the modification of the variable a is not part of that value computation. Yes, really. This is the part you're having trouble understanding, but that's simply how C and C++ define it.

Expressions result in values, but some expressions can have side effects. The expression a = 1 has a value of 1, but it also has the side effect of setting the variable a to 1. As far as how C and C++ define things, these are two different steps. Similarly, a++ has a value and a side-effect.

Sequence points define when side effects are visible to expressions that are evaluated after those sequence points. Operator precedence has nothing to do with sequence points. That's just how C/C++ defines things.

Nicol Bolas
  • The separation of the processing of side effects and the evaluation of the expression is definitely what I'm having trouble with, because I just have a bad feeling that there is a situation out there where the value of the side-effect is dependent on the outcome of the expression, in which case the side-effects MUST be processed prior to the complete evaluation of the expression. Side effects are simply a modification of a block of memory somewhere. What if the same expression operates on that block of memory? Can't think of any examples off-hand that would do this, though. – void.pointer Sep 07 '12 at 15:55
  • @RobertDailey: That's the point: if you write an expression, where the value computed by that expression is based on side effects that are not properly sequenced, you get undefined behavior. It's a two layer system: you need operator precedence, but the source values need to be sequenced to any side effects from prior expressions. – Nicol Bolas Sep 07 '12 at 15:57
  • @Robert "What if the same expression operates on that block of memory". I think that's the point. C and C++ (for better or worse) don't define the *order* in which side effects take place, but they say *when* those side effects must be *complete* (by the next sequence point prior to C++11, before the next "sequenced after" operation in C++11). Hence the rule "don't modify something twice between two sequence points" to prevent ambiguities. – jrok Sep 07 '12 at 16:01
  • Why make it undefined though? That's my point. `a = a++`, why not require that side-effects are processed on-demand, as soon as evaluation reaches that point? Modify `a` to equal 1, then return 0, then finally assign 0 to `a`. I find no reason to make the behavior undefined. What were they thinking when they set it up this way in the standard? – void.pointer Sep 07 '12 at 16:05
  • @RobertDailey: *How* it is undefined behavior is a different question from *why* it is. The latter is essentially speculative and probably has to do with giving compilers reasonable freedom of action and making their jobs easier. Or it may be some historical issue. – Nicol Bolas Sep 07 '12 at 16:14
  • @RobertDailey: Or maybe the spec writers were wise enough to know that endorsing code like `a = a++ + ++a` is a horrible idea ;) – Nicol Bolas Sep 07 '12 at 16:16
  • @NicolBolas "horrible idea", is that your personal opinion? Where are the facts? :) I can't think of a use case that would make it horrible, or destructive, or whatever else. But, horrible in terms of readability and the fact that it doesn't make much sense to do so, I agree. – void.pointer Sep 07 '12 at 16:18
1

This is probably too simplistic an explanation, but I think it is because there's no way to resolve when the code is "done" with "a". Is it done after the increment, or after the assignment? The resolution ends up being circular. Assignment after the increment changes the semantics of when the incremented value is applied. That is, the code isn't done with "a" until "a" gets incremented, but "a" doesn't get incremented until after the assignment is made. It's almost a language version of a deadlock.

As I said, I'm sure that's not a great "academic" explanation, but that's how I bottle it up between my own ears. Hope that's somehow helpful.

David W
1

The precedence rules specify how the expression is grouped, and so which values each operator needs, but side effects do not have to happen during evaluation. They can happen at any time before the next sequence point.

In this case, the side effect of the increment is sequenced neither before nor after the assignment, so the expression has undefined behaviour.

Mike Seymour
0

The point here is that on some CPU architectures, like Intel Itanium, these two operations could be parallelized by the compiler at the instruction level, but making your construct well-defined would forbid that. At the time of sequence point specification, these architectures were mostly hypothetical, and since Itanium was a flop, it's quite arguable that as of 2012, much of this is unnecessary complexity in the language. There's basically no possible disadvantage on any architecture still in common use, and even for Itanium, the performance advantage was minimal and the headache of writing a compiler that could even take advantage of it was huge.

Note also that in C++11, sequence points were replaced with sequenced before and sequenced after, which made more situations like this well-defined.

Puppy
  • Even on modern machines, a possible issue is that if `a` and `b` are references that identify the same `int`, and `x` and `y` are `int` variables, code like `a=b++; x=a; ...; y=b;` may be evaluated as `temp=b; a=temp; b=temp+1; x=temp; ... y=a;`, behaving as though `a` spontaneously changes between the assignments to `x` and `y`. I'd like to see the Standard define an execution model which would recognize contagious non-determinism but stay on the rails, since the marginal benefits from allowing arbitrary behaviors beyond that are slight, but there would be usefulness to limiting... – supercat May 21 '16 at 16:58
  • ...the consequences of overlapping assignments in cases where it doesn't matter what values get computed in certain edge cases provided that code stays on the rails. Requiring programmers to guard against such assignments even in those cases could make code less efficient than letting them yield arbitrary results. – supercat May 21 '16 at 17:02
0

That statement a=a++ has two results, and two assignments:

a=a

(because it's a post-increment), and

a=a+1

These assignments clearly result in a different final value for a.

The drafters of the C standard did not indicate which of the two assignments should be written to a first and which second, so compiler writers are free to choose whichever they like in any given situation.

The upshot is that this particular statement is unlikely to crash in practice, although the behavior is formally undefined, and your program can no longer rely on a having a specific value.

Alex Brown
0

Let me run through the basic problem in the statement a = a++. We want to achieve all of the following things:

  • determine the value of a (the return value of a++, #1)

  • increment a (the side effect of a++, #2)

  • assign the old value to a (effect of the assignment, #3)

There are two possible ways this could be sequenced:

  1. Store the original a into a (a no-op); then increment a. Same as a = a; ++a;. This is sequence #1-#3-#2.

  2. Evaluate a, increment a, assign the original value back to a. Same as b = a; ++a; a = b;. This is sequence #1-#2-#3.

Since there is no prescribed sequence, either of those orderings is permissible. But they have different ultimate results. Neither sequence is more natural than the other.

Kerrek SB