2

What does the rule about sequence points say about the following code?

int main(void) {
    int i = 5;
    printf("%d", ++i, i); /* Statement 1 */
}

There is just one %d. I am confused because I get 6 as output with GCC, Turbo C++ and Visual C++. Is the behavior well defined or not?

This is related to my last question.

Community
jaya

5 Answers

7

It's undefined because of 2 reasons:

  1. The value of i is used twice without an intervening sequence point (the comma in argument lists is not the comma operator and does not introduce a sequence point).

  2. You're calling a variadic function without a prototype in scope.

  3. The number of arguments passed to printf() is not compatible with the format string.

  4. The default output stream is usually line buffered. Without a '\n' there is no guarantee the output will actually appear.

pmg
  • `3` is not a correct reason. You can pass more arguments than there are format specifiers. BTW +1 for mentioning 2. – Prasoon Saurav Mar 22 '11 at 11:25
  • I'm not sure about 4, either. stdout is flushed at program exit. What the terminal does with output that isn't line-terminated is outside the scope of the standard, but it's all written out of the program regardless of how stdout is or isn't buffered. – Steve Jessop Mar 22 '11 at 12:13
  • Thanks @Steve. I tend to consider the lack of a terminating `'\n'` an error because some utilities I use do not consider characters after the last line break. That is a bug in the utilities, not in the C language :) – pmg Mar 22 '11 at 12:26
  • or possibly it's a bug in this program, since it doesn't write the output format that is expected by those utilities. Just not a UB bug. – Steve Jessop Mar 22 '11 at 12:29
6

All arguments get evaluated when calling a function, even if they are not used, so, since the order of evaluation of function arguments is undefined, you have UB again.

Björn Pollex
  • Technically, it isn't UB but unspecified behaviour. See last paragraph on the first page of ISO 9899:1999 Annex J.1. – Lundin Mar 22 '11 at 11:48
  • @Lundin: I am not sure if it is the same in C++03 as in C99. – Björn Pollex Mar 22 '11 at 11:52
  • @Lundin: in C++ the order of evaluation of function arguments is unspecified, meaning that in this case the behavior is undefined because of 5/4 (last sentence). C isn't quite so explicit that if *any* permitted order breaks the rules then behavior is undefined, but in this case one order of evaluating the arguments breaks the "shall" rules of 6.5/2, and the other doesn't. Since you can't predict which order they'll be evaluated, it's unspecified whether behavior is defined or not, which isn't a good place to be. – Steve Jessop Mar 22 '11 at 12:27
  • @Steve Jessop: I think this is an interesting point for discussion, and I am not so sure. The order of execution is unspecified, but that only has two possible outcomes, either `i` or `++i` is executed before the other. Because `i` has no side effects and is ignored by the function ("%d" will only use the first argument) then the output of the `printf` is necessarily fixed. I know I am treading a fine line here, and I am not really sure of this, but I don't think it is undefined. – David Rodríguez - dribeas Mar 22 '11 at 12:40
  • @David Rodríguez - dribeas : After a bit of researching I've found an exact dupe: http://stackoverflow.com/questions/3450582/c-programming-is-this-undefined-behavior – Prasoon Saurav Mar 22 '11 at 12:52
  • @David Rodriguez: The problem isn't with i, but with ++i. It has to update the value of i sometime before the next sequence point. If it does so in a way that a pending write and a pending read "meet", some hardware (no reference handy:) might reject that. So the language standard says: "Don't do it!". – Bo Persson Mar 22 '11 at 12:55
  • @dribeas: in the case where `i` is evaluated before `++i`, the "prior value" of `i` is accessed other than to determine the new value, without a sequence point. OK, so `printf` doesn't use it, but varargs are passed by value not reference, and so an lvalue-to-rvalue conversion is required (aka, "reading the value of i") as part of argument evaluation. – Steve Jessop Mar 22 '11 at 13:21
3

I think it's well defined. The printf matches the first % placeholder to the first argument, which in this instance is a preincremented variable.

forsvarir
  • It's undefined, because those arguments are still evaluated just like normal; the only difference is that printf won't use that value. – Puppy Mar 22 '11 at 11:15
  • @DeadMG Doesn't that mean it's defined? As the second parameter doesn't modify the value and printf doesn't use the second parameter, its value is irrelevant to the output / behaviour. – forsvarir Mar 22 '11 at 11:25
  • @forsvarir: Say `i=5`. If `i` is evaluated first, then the function is called with `6,5` as arguments. If `++i` is evaluated first, then the function is called with `6,6` as arguments. It does not make a difference here, but in other scenarios it might. – Björn Pollex Mar 22 '11 at 11:28
  • There is only one way for the above code to behave (print "6" and let i==6 afterwards). That's the definition of defined behaviour. – Tim Mar 22 '11 at 11:34
  • @Space_C0wb0y: I guess it depends how you define 'defined behaviour'. I agree that what gets passed to printf isn't defined and so it's something to be cautious of, however for the example given, the measurable behaviour is defined (it will always output 6 to stdout). But I'm willing to acknowledge I may have misinterpreted the question :) – forsvarir Mar 22 '11 at 11:37
  • @forsvarir: The point is that undefined behavior can make *anything* happen. If the compiler does some optimization that relies on the value of `i` not being read *and* modified between two sequence points, this could crash the program. – Björn Pollex Mar 22 '11 at 11:44
  • The behavior is undefined. It has nothing to do with what gets passed to printf, or whether it is used or not. The standard says that behavior is undefined if you modify an object, and you access it elsewhere without an intervening sequence point, other than to determine the new value. Theoretically, at least, printf may never be called; the program could crash before that. – James Kanze Mar 22 '11 at 11:51
  • @Space_C0wb0y: Whilst I’m sceptical this would happen in practice (see Ferruccio’s comment on the question), I take your point. Thanks for the lesson. – forsvarir Mar 22 '11 at 12:34
0

According to this documentation, any additional arguments passed to a format string shall be ignored. It also mentions for fprintf that the argument will be evaluated then ignored. I'm not sure if this is the case with printf.

Don Kirkby
xyzcl
  • Every argument gets evaluated /before/ the function call, and `printf` /then/ decides to ignore some of them. Nevertheless you delivered those parameters so you have undefined behaviour™. – filmor Mar 22 '11 at 11:28
0

All arguments are evaluated. The order is not defined. All implementations of C/C++ (that I know of) evaluate function arguments from right to left. Thus i is usually evaluated before ++i.

In printf, %d maps to the first argument. The rest are ignored.

So printing 6 is the correct behavior.

I believe that the right-to-left evaluation order is very old (dating back to the first C compilers). It is certainly older than C++, and most implementations of C++ would have kept the same evaluation order because early C++ implementations simply translated into C.

There are some technical reasons for evaluating function arguments right-to-left. In stack architectures, arguments are typically pushed onto the stack. In C, you can call a function with more arguments than actually specified -- the extra arguments are simply ignored. If arguments are evaluated left-to-right, and pushed left-to-right, then the stack slot right under the stack pointer will hold the last argument, and there is no way for the function to get at the offset of any particular argument (because the actual number of arguments pushed depends on the caller).

In a right-to-left push order, the stack slot right under the stack pointer will always hold the first argument, and the next slot holds the second argument etc. Argument offsets will always be deterministic for the function (which may be written and compiled elsewhere into a library, separately from where it is called).

Now, right-to-left push order does not mandate right-to-left evaluation order, but in early compilers, memory was scarce. In right-to-left evaluation order, the same stack can be used in-place (essentially, after evaluating the argument -- which may be an expression or a function call! -- the return value is already at the right position on the stack). In left-to-right evaluation, the argument values must be stored separately and then pushed back to the stack in reverse order.

Would be interested to know the true history behind right-to-left evaluation though.

Stephen Chung