44

From the C++ (C++11) standard, §1.9.15 which discusses ordering of evaluation, is the following code example:

void g(int i, int* v) {
    i = v[i++]; // the behavior is undefined
}

As noted in the code sample, the behavior is undefined.

(Note: The answer to another question with the slightly different construct i + i++, Why is a = i + i++ undefined and not unspecified behaviour, might apply here: The answer is essentially that the behavior is undefined for historical reasons, and not out of necessity. However, the standard seems to imply some justification for this being undefined - see quote immediately below. Also, that linked question indicates agreement that the behavior should be unspecified, whereas in this question I am asking why the behavior is not well-specified.)

The reasoning given by the standard for the undefined behavior is as follows:

If a side effect on a scalar object is unsequenced relative to either another side effect on the same scalar object or a value computation using the value of the same scalar object, the behavior is undefined.

In this example I would think that the subexpression i++ would be completely evaluated before the subexpression v[...] is evaluated, and that the result of evaluation of the subexpression is i (before the increment), but that the value of i is the incremented value after that subexpression has been completely evaluated. I would think that at that point (after the subexpression i++ has been completely evaluated), the evaluation v[...] takes place, followed by the assignment i = ....

Therefore, although the incrementing of i is pointless, I would nonetheless think that this should be defined.

Why is this undefined behavior?

Community
  • 1
  • 1
Dan Nissenbaum
  • 13,558
  • 21
  • 105
  • 181
  • I suspect it's undefined in C++ because it has always been undefined in C. – NPE Dec 06 '12 at 13:23
  • @NPE The standard *appears* to give some justification for this involving sequencing (as in the quote and related text from the standard), leading me to believe that the issue of sequencing has a rationale behind it, rather than just being historical. I guess that's part of what I'm asking. – Dan Nissenbaum Dec 06 '12 at 13:24
  • But what that quote appears to be talking about is sequence points, which again is a concept that goes back to C. Of course, I could be missing the point (pardon the pun). – NPE Dec 06 '12 at 13:27
  • 1
    hmm... Why a downvote? I always like to learn what I should do differently. – Dan Nissenbaum Dec 06 '12 at 13:36
  • For the record, I didn't downvote. I think it's a nice question. – NPE Dec 06 '12 at 13:37
  • There are two changes to i. It is not defined which change happens first. – brian beuning Dec 06 '12 at 13:45
  • @brianbeuning: it's not defined whether *either* of them happens *at all*. – Steve Jessop Dec 06 '12 at 13:56
  • 4
    Note that all classic "i++ goo" examples like this are a mix up of the _undefined_ behavior "assigning a value to a variable twice in an expression, without a sequence point in between", and the _unspecified_ behavior "order of evaluation of sub-expressions". Both apply at the same time. – Lundin Dec 06 '12 at 14:04
  • I'm surprise of this behavior, when you override operator++, you actually do a copy, increment the original and then return the copy. So, I was expecting something like: j = i; ++i; t = v[j]; i = t; – widgg Dec 06 '12 at 14:10
  • @widgg I was expecting something like that too, but as all of these outstanding answers and comments reveal, the reality is more... involved. – Dan Nissenbaum Dec 06 '12 at 14:12
  • @DanNissenbaum still, it's interesting to know! – widgg Dec 06 '12 at 17:20
  • the `++` here is totally pointless.. can you come up with an example where this is actually useful? – Karoly Horvath Dec 10 '12 at 19:05
  • @KarolyHorvath It's not intended to be useful. This is purely a theoretical question, intentionally chosen to be simple, and intended to help understand side effects, order of evaluation, and sequence points. Because it's undefined behavior, such a construct should never appear in a program, so I don't think it could ever be useful (except for some theoretical purpose in which undefined behavior is intended). – Dan Nissenbaum Dec 10 '12 at 20:56
  • Perhaps it's that I started back in pre-8080 days, or perhaps I've just aged into curmudgeonry, but I think it's abominable that such behavior is permitted to be undefined. I understand the advantages for optimization, but the disadvantages of unintended (and *obscure*) bugs, IM – Alan Jay Weiner Dec 11 '12 at 19:16
  • @Alan Jay Weiner: "overwhelm any advantages" - you're completely wrong. C/C++ is all about performance. if this scares you, just pick another language. – Karoly Horvath Dec 15 '12 at 16:53
  • @Karoly Horvath - Performance and C/C++ doesn't scare me at all - I've used C since the late 1970s. Undefined behaviors can't be used; can't be trusted to work correctly in the future even if it works today. Hmmm - Fast buggy code vs (maybe) slightly slower reliable code... Worse: bugs appear *years later* by upgrading compiler or computer? Defining behavior doesn't have to negatively affect performance. E.g. order of operations: allows predictable and reliable optimizations - runs *faster*. (btw, sorry to comment such an old thread - don't want to leave myself misunderstood.) – Alan Jay Weiner Jan 23 '18 at 04:29

8 Answers8

42

I would think that the subexpression i++ would be completely evaluated before the subexpression v[...] is evaluated

But why would you think that?

One historical reason for this code being UB is to allow compiler optimizations to move side-effects around anywhere between sequence points. The fewer sequence points, the more potential opportunities to optimize but the more confused programmers. If the code says:

a = v[i++];

The intention of the standard is that the code emitted can be:

a = v[i];
++i;

which might be two instructions where:

tmp = i;
++i;
a = v[tmp];

would be more than two.

The "optimized code" breaks when a is i, but the standard permits the optimization anyway, by saying that behavior of the original code is undefined when a is i.

The standard easily could say that i++ must be evaluated before the assignment as you suggest. Then the behavior would be fully defined and the optimization would be forbidden. But that's not how C and C++ do business.

Also beware that many examples raised in these discussions make it easier to tell that there's UB around than it is in general. This leads to people saying that it's "obvious" the behavior should be defined and the optimization forbidden. But consider:

void g(int *i, int* v, int *dst) {
    *dst = v[(*i)++];
}

The behavior of this function is defined when i != dst, and in that case you'd want all the optimization you can get (which is why C99 introduces restrict, to allow more optimizations than C89 or C++ do). In order to give you the optimization, behavior is undefined when i == dst. The C and C++ standards tread a fine line when it comes to aliasing, between undefined behavior that's not expected by the programmer, and forbidding desirable optimizations that fail in certain cases. The number of questions about it on SO suggests that the questioners would prefer a bit less optimization and a bit more defined behavior, but it's still not simple to draw the line.

Aside from whether the behavior is fully defined is the issue of whether it should be UB, or merely unspecified order of execution of certain well-defined operations corresponding to the sub-expressions. The reason C goes for UB is all to do with the idea of sequence points, and the fact that the compiler need not actually have a notion of the value of a modified object, until the next sequence point. So rather than constrain the optimizer by saying that "the" value changes at some unspecified point, the standard just says (to paraphrase): (1) any code that relies on the value of a modified object prior to the next sequence point, has UB; (2) any code that modifies a modified object has UB. Where a "modified object" is any object that would have been modified since the last sequence point in one or more of the legal orders of evaluation of the subexpressions.

Other languages (e.g. Java) go the whole way and completely define the order of expression side-effects, so there's definitely a case against C's approach. C++ just doesn't accept that case.

Steve Jessop
  • 273,490
  • 39
  • 460
  • 699
  • 4
    +1, also note that in modern cpus different operations have different costs and the compiler might do other reorderings that could lead to potentially more although faster operations. Without deep understanding on the processor architecture it is almost impossible to know what will be faster at this micro level, but that is the compiler's job. – David Rodríguez - dribeas Dec 06 '12 at 13:38
  • @David: good point, I used number of instructions as one example of why it's good to be allowed to optimize. There are many, many more reasons. – Steve Jessop Dec 06 '12 at 13:39
  • Thank you for the excellent answer. I wish I could accept multiple answers! – Dan Nissenbaum Dec 10 '12 at 18:02
30

I'm going to design a pathological computer1. It is a multi-core, high-latency, single-thread system with in-thread joins that operates with byte-level instructions. So you make a request for something to happen, then the computer runs (in its own "thread" or "task") a byte-level set of instructions, and a certain number of cycles later the operation is complete.

Meanwhile, the main thread of execution continues:

void foo(int v[], int i){
  i = v[i++];
}

becomes in pseudo-code:

input variable i // = 0x00000000
input variable v // = &[0xBAADF00D, 0xABABABABAB, 0x10101010]
task get_i_value: GET_VAR_VALUE<int>(i)
reg indx = WAIT(get_i_value)
task write_i++_back: WRITE(i, INC(indx))
task get_v_value: GET_VAR_VALUE<int*>(v)
reg arr = WAIT(get_v_value)
task get_v[i]_value = CALC(arr + sizeof(int)*indx)
reg pval = WAIT(get_v[i]_value)
task read_v[i]_value = LOAD_VALUE<int>(pval)
reg got_value = WAIT(read_v[i]_value)
task write_i_value_again = WRITE(i, got_value)
(discard, discard) = WAIT(write_i++_back, write_i_value_again)

So you'll notice that I didn't wait on write_i++_back until the very end, the same time as I was waiting on write_i_value_again (which value I loaded from v[]). And, in fact, those writes are the only writes back to memory.

Imagine if write to memory are the really slow part of this computer design, and they get batched up into a queue of things that get processed by a parallel memory modifying unit that does things on a per-byte basis.

So the write(i, 0x00000001) and write(i, 0xBAADF00D) execute unordered and in parallel. Each gets turned into byte-level writes, and they are randomly ordered.

We end up writing 0x00 then 0xBA to the high byte, then 0xAD and 0x00 to the next byte, then 0xF0 0x00 to the next byte, and finally 0x0D 0x01 to the low byte. The resulting value in i is 0xBA000001, which few would expect, yet would be a valid result to your undefined operation.

Now, all I did there was result in an unspecified value. We haven't crashed the system. But the compiler would be free to make it completely undefined -- maybe sending two such requests to the memory controller for the same address in the same batch of instructions actually crashes the system. That would still be a "valid" way to compile C++, and a "valid" execution environment.

Remember, this is a language where restricting the size of pointers to 8 bits is still a valid execution environment. C++ allows for compiling to rather wonkey targets.

1: As noted in @SteveJessop's comment below, the joke is that this pathological computer behaves a lot like a modern desktop computer, until you get down to the byte-level operations. Non-atomic int writing by a CPU isn't all that rare on some hardware (such as when the int isn't aligned the way the CPU wants it to be aligned).

Yakk - Adam Nevraumont
  • 262,606
  • 27
  • 330
  • 524
  • 12
    "I'm going to design a pathological computer ... rather wonkey targets". And in case it isn't obvious, what Yakk describes wasn't far off a modern x86 or AMD64 processor until the writes were broken down into independent byte writes. Instruction pipelining occasionally makes a *huge* difference to performance, and sequence points constrain the compiler to wait for ops in the pipeline to complete. – Steve Jessop Dec 06 '12 at 14:14
  • This is an outstanding answer, thanks. I think the reason I asked the question to begin with is because I am trying to understand compilation-level (even assembly-level) details, even though many/most of those details are not specified by the standard ... yet, as this example shows, there is an interplay. – Dan Nissenbaum Dec 06 '12 at 14:30
  • 1
    @DanNissenbaum: As Steve already pointed out, modern high-speed CPU designs have these complex instruction pipelines which are invisible even at the assembly level (if only for backwards compatibility). What you _think_ is the `EAX` register is just a random register that momentarily has the label EAX. – MSalters Dec 07 '12 at 00:22
  • There are many excellent answers to this question. I appreciate it. This particular answer really rang home for me. Thanks. – Dan Nissenbaum Dec 10 '12 at 18:03
  • Years later and I still find this answer to be one of the most enlightening I've seen - it transformed my understanding of C++. It's now obvious to me why that construct is not defined. – Dan Nissenbaum Feb 12 '17 at 19:15
24

The reason is not just historical. Example:

int f(int& i0, int& i1) {
    return i0 + i1++;
}

Now, what happens with this call:

int i = 3;
int j = f(i, i);

It's certainly possible to put requirements on the code in f so that the result of this call is well defined (Java does this), but C and C++ don't impose constraints; this gives more freedom to optimizers.

Pete Becker
  • 74,985
  • 8
  • 76
  • 165
10

You specifically refer to the C++11 standard so I'm going to answer with the C++11 answer. It is, however, very similar to the C++03 answer, but the definition of sequencing is different.

C++11 defines a sequenced before relation between evaluations on a single thread. It is asymmetric, transitive and pair-wise. If some evaluation A is not sequenced before some evaluation B and B is also not sequenced before A, then the two evaluations are unsequenced.

Evaluating an expression includes both value computations (working out the value of some expression) and side effects. One instance of a side effect is the modification of an object, which is the most important one for answering question. Other things also count as side effects. If a side effect is unsequenced relative to another side effect or value computation on the same object, then your program has undefined behaviour.

So that's the set up. The first important rule is:

Every value computation and side effect associated with a full-expression is sequenced before every value computation and side effect associated with the next full-expression to be evaluated.

So any full expression is fully evaluated before the next full expression. In your question, we're only dealing with one full expression, namely i = v[i++], so we don't need to worry about this. The next important rule is:

Except where noted, evaluations of operands of individual operators and of subexpressions of individual expressions are unsequenced.

That means that in a + b, for example, the evaluation of a and b are unsequenced (they may be evaluated in any order). Now for our final important rule:

The value computations of the operands of an operator are sequenced before the value computation of the result of the operator.

So for a + b, the sequenced before relationships can be represented by a tree where a directed arrow represents the sequenced before relationship:

a + b (value computation)
^   ^
|   |
a   b (value computation)

If two evaluations occur in separate branches of the tree, they are unsequenced, so this tree shows that the evaluations of a and b are unsequenced relative to each other.

Now, let's do the same thing to your i = v[i++] example. We make use of the fact that v[i++] is defined to be equivalent to *(v + (i++)). We also use some extra knowledge about the sequencing of postfix increment:

The value computation of the ++ expression is sequenced before the modification of the operand object.

So here we go (a node of the tree is a value computation unless specified as a side effect):

i = v[i++]
^     ^
|     |
i★  v[i++] = *(v + (i++))
                  ^
                  |
               v + (i++)
               ^     ^
               |     |
               v     ++ (side effect on i)★
                     ^
                     |
                     i

Here you can see that the side effect on i, i++, is in a separate branch to the usage of i in front of the assignment operator (I marked each of these evaluations with a ★). So we definitely have undefined behaviour! I highly recommend drawing these diagrams if you ever wonder if your sequencing of evaluations is going to cause you trouble.

So now we get the question about the fact that the value of i before the assignment operator doesn't matter, because we write over it anyway. But actually, in the general case, that's not true. We can override the assignment operator and make use of the value of the object before the assignment. The standard doesn't care that we don't use that value - the rules are defined such that having any value computation unsequenced with a side effect will be undefined behaviour. No buts. This undefined behaviour is there to allow the compiler to emit more optimized code. If we add sequencing for the assignment operator, this optimization cannot be employed.

Joseph Mansfield
  • 108,238
  • 20
  • 242
  • 324
  • I believe this kind of reasoning isn't correct. Correct me if I'm wrong, but you could use the same rationale for the expression `i = v[++i];` concluding the expression would also be UB, but it is actually a well-defined code. I think the answers that were given on this post (https://stackoverflow.com/questions/10778627/why-is-i-i-1-undefined-behavior-in-c11) are more precise. – domdrag Nov 29 '22 at 18:30
4

In this example I would think that the subexpression i++ would be completely evaluated before the subexpression v[...] is evaluated, and that the result of evaluation of the subexpression is i (before the increment), but that the value of i is the incremented value after that subexpression has been completely evaluated.

The increment in i++ must be evaluated before indexing v and thus before assigning to i, but storing the value of that increment back to memory need not happen before. In the statement i = v[i++] there are two suboperations that modify i (i.e. will end up causing a store from a register into the variable i). The expression i++ is equivalent to x=i+1, i=x, and there is no requirement that both operations need to take place sequentially:

x = i+1;
y = v[i];
i = y;
i = x;

With that expansion, the result of i is unrelated to the value in v[i]. On a different expansion, the i = x assignment could take place before the i = y assignment, and the result would be i = v[i]

David Rodríguez - dribeas
  • 204,818
  • 23
  • 294
  • 489
4

There two rules.

The first rule is about multiple writes which give rise to a "write-write hazard": the same object cannot be modified more than once between two sequence points.

The second rule is about "read-write hazards". It is this: if an object is modified in an expression, and also accessed, then all accesses to its value must be for the purpose of computing the new value.

Expressions like i++ + i++ and your expression i = v[i++] violate the first rule. They modify an object twice.

An expression like i + i++ violates the second rule. The subexpression i on the left observes the value of a modified object, without being involved in the calculation of its new value.

So, i = v[i++] violates a different rule (bad write-write) from i + i++ (bad read-write).


The rules are too simplistic, which gives rise to classes of puzzling expressions. Consider this:

p = p->next = q

This appears to have a sane data flow dependency that is free of hazards: the assignment p = cannot take place until the new value is known. The new value is the result of p->next = q. The the value q should not "race ahead" and get inside p, such that p->next is affected.

Yet, this expression breaks the second rule: p is modified, and also used for a purpose not related to computing its new value, namely determining the storage location where the value of q is placed!

So, perversely, compilers are allowed to partially evaluate p->next = q to determine that the result is q, and store that into p, and then go back and complete the p->next = assignment. Or so it would seem.

A key issue here is, what is the value of an assignment expression? The C standard says that the value of an assignment expression is that of the lvalue, after the assignment. But that is ambiguous: it could be interpreted as meaning "the value which the lvalue will have, once the assignment takes place" or as "the value which can be observed in the lvalue after the assignment has taken place". In C++ this is made clear by the wording "[i]n all cases, the assignment is sequenced after the value computation of the right and left operands, and before the value computation of the assignment expression.", so p = p->next = q appears to be valid C++, but dubious C.

Kaz
  • 55,781
  • 9
  • 100
  • 149
  • Interesting to note that due to the wording of the C++ specification you've indicated, the guarantee that `p = p->next = q` works as expected comes when applying the rule to the SECOND assignment: The rhs (as well as lhs) of the second assignment operation (first equals sign) must have its side effects completed before performing the assignment, and this rhs value is the result of the completion of the first assignment operation (second equals sign). – Dan Nissenbaum Dec 10 '12 at 18:01
2

I would share your arguments if the example were v[++i], but since i++ modifies i as a side-effect, it is undefined as to when the value is modified. The standard could probably mandate a result one way or the other, but there's no true way of knowing what the value of i should be: (i + 1) or (v[i + 1]).

moswald
  • 11,491
  • 7
  • 52
  • 78
  • Don't all side effects of a subexpression have to be evaluated before any containing expression is evaluated? – Dan Nissenbaum Dec 06 '12 at 13:27
  • 1
    No, that's **order of evaluation**. The example is much simpler: it uses the value of `i` and modifies `i` without an intervening sequence point. Yes, it would be possible to define what that means in simple cases like this, but when you're deep inside a function that ends up doing something similar because of the arguments that were passed, it's much harder to see what a sensible rule might be. – Pete Becker Dec 06 '12 at 13:28
  • 1
    @DanNissenbaum - no, side effects have to be completed at the next sequence point. Sometimes a subexpression is bounded by two sequence points, but not necessarily. Silly example: `i++ && i` is well-defined, because `&&` introduces a sequence point. `i++ + i` is not; `+` does not introduce a sequence point. – Pete Becker Dec 06 '12 at 13:30
  • I guess I don't understand the difference between "sequence points" and "order of evaluation". I was unfamiliar with sequence points before today. Perhaps I need to study it (though I'm now looking at your second comment...) – Dan Nissenbaum Dec 06 '12 at 13:30
  • @DanNissenbaum - don't read anything deep into "sequence point". Some code constructs introduce sequence points because the language definition says they do. Similarly for the C++11 term "unsequenced", although the description is more complicated in order to make it clearly defined across multiple threads. – Pete Becker Dec 06 '12 at 13:33
  • @PeteBecker Your second comment explains this perfectly. (Also, I will now have a look in the C++ standard to see if there's one clear place that defines all operators that require a sequence point.) – Dan Nissenbaum Dec 06 '12 at 13:34
  • 3
    Note that even in the `++i` case, there are two operations: the increment (performed in a register) and the store back to `i`. The store is not sequenced with respect to the rest of the expression, so even if the increment is calculated before indexing the array, the store might happen after the assignment. – David Rodríguez - dribeas Dec 06 '12 at 13:36
  • @DavidRodríguez-dribeas Understood. I just find that one involving `++i` in particular to be counter-intuitive, and I'd have liked to make an argument to the standards committee way back when. :) – moswald Dec 06 '12 at 17:21
  • @moswald: A comment back in the day and a comment today are basically the same thing and will encounter the same issues. Changing that bit in the standard would be a backwards compatible change, but it would not be accepted on the grounds that it will have an impact on performance and whatnot... – David Rodríguez - dribeas Dec 06 '12 at 17:38
  • @DavidRodríguez-dribeas Point taken. – moswald Dec 06 '12 at 18:05
1

Think about the sequences of machine operations necessary for each of the following assignment statements, assuming the given declarations are in effect:

extern int *foo(void);
extern int *p;

*p = *foo();
*foo() = *p;

If the evaluation of the subscript on the left side and the value on the right side are unsequenced, the most efficient ways to process the two function calls would likely be something like:

[For *p = *foo()]
call foo (which yields result in r0 and trashes r1)
load r0 from address held in r0
load r1 from address held in p
store r0 to address held in r1

[For *foo() = *p]
call foo (which yields result in r0 and trashes r1)
load r1 from address held in p
load r1 from address held in r1
store r1 to address held in r0

In either case, if p or *p were read into a register before the call to foo, then unless "foo" promises not to disturb that register, the compiler would need to add an extra step to save its value before calling "foo", and another extra step to restore the value afterward. That extra step might be avoided by using a register that "foo" won't disturb, but that would only help if there were a such a register which didn't hold a value needed by the surrounding code.

Letting the compiler read the value of "p" before or after the function call, at its leisure, will allow both patterns above to be handled efficiently. Requiring that the address of the left-hand operand of "=" always be evaluated before the right hand side would likely make the first assignment above less efficient than it otherwise could be, and requiring that the address of the left-hand operand be evaluated after the right-hand side would make the second assignment less efficient.

supercat
  • 77,689
  • 9
  • 166
  • 211