4

I am confused about the output of this code:

int c=3;
cout<<(c++)*(c++);

I use gcc and the output is 9, but someone said that it's undefined behavior. Why?

M.M
StackBox
  • Well that is c++ not c – Greg Brown Apr 29 '12 at 04:19
  • Same result, same issue, either C or C++ – paulsm4 Apr 29 '12 at 04:29
  • 2
    @paulsm4 Not really. In C, this is very specific defined behaviour: you get a compiler error. – Mr Lister Apr 29 '12 at 08:53
  • 1
    @Mr. Lister: why is this a compiler error in C any more than in C++? While in C++ `cout` is typically declared by `<iostream>` or a related header, in C (or C++ for that matter) `cout` could be declared to be an `int`. I guess what I'm saying is that the `cout<<` part of the example code isn't really important to the part that's an example of undefined behavior. – Michael Burr Apr 29 '12 at 22:08

5 Answers

4

The issue is "Sequence points":

http://en.wikipedia.org/wiki/Sequence_point

A sequence point in imperative programming defines any point in a computer program's execution at which it is guaranteed that all side effects of previous evaluations will have been performed, and no side effects from subsequent evaluations have yet been performed.

Sequence points also come into play when the same variable is modified more than once within a single expression. An often-cited example is the C expression i=i++, which apparently both assigns i its previous value and increments i. The final value of i is ambiguous, because, depending on the order of expression evaluation, the increment may occur before, after, or interleaved with the assignment. The definition of a particular language might specify one of the possible behaviors or simply say the behavior is undefined. In C and C++, evaluating such an expression yields undefined behavior.[1]

As it happens, I get exactly the same answer - "9" - on both MSVC (Windows) and gcc (Linux). I also get a warning whether I compile with gcc ("C") or g++ (C++):

$ g++ -o tmp -Wall -pedantic tmp.cpp
tmp.cpp: In function "main(int, char**)":
tmp.cpp:7: warning: operation on "c" may be undefined
$ ./tmp
c=9...
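
For comparison, here is a minimal sketch of how the same intent could be written with defined behavior, by giving each modification of c its own full statement. The helper names are hypothetical, and which of the two readings the original author intended is an assumption.

```cpp
// Hypothetical helper: "multiply the old value by itself, then increment twice".
// Each modification of c is a separate full statement, so every step is sequenced.
int square_then_increment_twice(int& c) {
    int product = c * c;  // only reads c; nothing is modified in this expression
    ++c;                  // first increment, complete at the semicolon
    ++c;                  // second increment
    return product;       // for c == 3: returns 9 and leaves c == 5
}

// Hypothetical helper: "increment between the two reads".
int multiply_across_increment(int& c) {
    int lhs = c++;        // lhs holds the old value; c is updated before the next statement
    int rhs = c++;        // rhs holds the already incremented value
    return lhs * rhs;     // for c == 3: returns 12 and leaves c == 5
}
```

Either version compiles without the "may be undefined" warning, because no single expression modifies c more than once.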
paulsm4
3

Undefined behavior means anything can happen.

The output is 9, but with different compilers or different compiler switches, it might also be 12, 0 or 2147483647.

Dennis
1

The C spec leaves many things undefined, and they're pretty much left to the discretion of whoever implements the language (i.e. writes a compiler). Among these undefined things is the order of evaluation of various parts of the expression.

For instance, is the multiplication calculated first followed by both ++es, or is one ++ calculated first, then the multiplication, then the other ++?

Nathan Fellman
  • Order of evaluation isn't undefined behavior - it's unspecified behavior. The difference is that the compiler can't do whatever it wants - it still has to evaluate the expression correctly. It just doesn't have to perform that evaluation in any particular order (excepting certain operations which do impose an order). On the other hand, the expression `i += ++i` is undefined - there *is no* correct behavior and the compiler is free to do whatever it wants. – Michael Burr Apr 29 '12 at 09:16
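
The distinction drawn in the comment above can be illustrated with a small sketch. The functions `left`, `right`, and `product` and the `call_log` variable are hypothetical names invented for this example. The order in which the two operands are evaluated is unspecified, but the two calls cannot interleave, so the result is well defined either way.

```cpp
#include <string>

std::string call_log;  // records the order in which the operands were evaluated

int left()  { call_log += 'L'; return 3; }
int right() { call_log += 'R'; return 4; }

int product() {
    call_log.clear();
    // Unspecified: the compiler may call left() or right() first.
    // Not undefined: each call runs to completion before the other starts,
    // so the product is 12 regardless of the order chosen.
    return left() * right();
}
```

After calling `product()`, `call_log` holds either "LR" or "RL", depending on the compiler. With `(c++)*(c++)` there is no such guarantee, because the two modifications of `c` are not separated by function calls or any other sequence point.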
0

Depending on how the compiler is written, what flags are used to compile, the phase of the moon, etc., the answer could be 9, 16, 20, or it could produce nasal demons. Always try to avoid confusing code and undefined behavior. Read up on sequence points to learn how to avoid this.

Matt Sieker
0

One way of thinking about sequence points, and why your example has both unspecified behavior and undefined behavior, is to consider an implementation which first introduces temporary variables.

Such an implementation might handle post-increment as follows:

tmp_1 = c;            // read 'c'
tmp_2 = tmp_1 + 1;    // calculate the incremented value
c = tmp_2;            // write to 'c'
tmp_1;                // the result of the expression

The original expression (c++)*(c++) has two sequences:

lhs_1 = c;            // read 'c'
lhs_2 = lhs_1 + 1;    // calculate the incremented value
c = lhs_2;            // write to 'c'
lhs_1;                // the resulting value of the expression

rhs_1 = c;            // read 'c'
rhs_2 = rhs_1 + 1;    // calculate the incremented value
c = rhs_2;            // write to 'c'
rhs_1;                // the resulting value of the expression

The order may be:

lhs_1 = c;            // read 'c'
lhs_2 = lhs_1 + 1;    // calculate the incremented value
c = lhs_2;            // write to 'c'

rhs_1 = c;            // read 'c'
rhs_2 = rhs_1 + 1;    // calculate the incremented value
c = rhs_2;            // write to 'c'

lhs_1 * rhs_1         // (3 * 4) new value of 'c' is 5

Or:

lhs_1 = c;            // read 'c'
rhs_1 = c;            // read 'c'

lhs_2 = lhs_1 + 1;    // calculate the incremented value
c = lhs_2;            // write to 'c'

rhs_2 = rhs_1 + 1;    // calculate the incremented value
c = rhs_2;            // write to 'c'

lhs_1 * rhs_1         // (3 * 3) new value of 'c' is 4

Or:

rhs_1 = c;            // read 'c'
rhs_2 = rhs_1 + 1;    // calculate the incremented value
c = rhs_2;            // write to 'c'

lhs_1 = c;            // read 'c'
lhs_2 = lhs_1 + 1;    // calculate the incremented value
c = lhs_2;            // write to 'c'

lhs_1 * rhs_1         // (4 * 3) new value of 'c' is 5

....etc.

The unspecified behavior is that either the lhs or the rhs may be evaluated first. The undefined behavior is that c is modified twice, and read, without an intervening sequence point.
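
The three orderings listed above can be written out as ordinary, well-defined C++ to check the arithmetic. This is only a model of what a compiler might do: the `Outcome` struct and the function names are hypothetical, and because the real expression is undefined, a compiler is not limited to these outcomes.

```cpp
struct Outcome {
    int product;  // value of the multiplication
    int c;        // value of 'c' afterwards
};

// The lhs increment completes before the rhs increment begins.
Outcome lhs_then_rhs(int c) {
    int lhs_1 = c;  c = lhs_1 + 1;
    int rhs_1 = c;  c = rhs_1 + 1;
    return {lhs_1 * rhs_1, c};   // for c == 3: {12, 5}
}

// Both reads of 'c' happen before either write.
Outcome reads_then_writes(int c) {
    int lhs_1 = c;
    int rhs_1 = c;
    c = lhs_1 + 1;
    c = rhs_1 + 1;
    return {lhs_1 * rhs_1, c};   // for c == 3: {9, 4}
}

// The rhs increment completes before the lhs increment begins.
Outcome rhs_then_lhs(int c) {
    int rhs_1 = c;  c = rhs_1 + 1;
    int lhs_1 = c;  c = lhs_1 + 1;
    return {lhs_1 * rhs_1, c};   // for c == 3: {12, 5}
}
```

The second model matches the 9 reported in the question, which suggests (but does not prove) that gcc read c twice before storing either increment.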

Richard Corden