12

I read here that there is a sequence point:

After the action associated with input/output conversion format specifier. For example, in the expression printf("foo %n %d", &a, 42), there is a sequence point after the %n is evaluated before printing 42.

However, when I run this code:

#include <stdio.h>

int your_function(int a, int b) {
    return a - b;
}

int main(void) {
    int i = 10;

    printf("%d - %d - %d\n", i, your_function(++i, ++i), i);
}

Instead of what I expect I get:

12 - 0 - 12

Meaning that there was not a sequence point created for the conversion format specifier. Is http://en.wikipedia.org wrong, or have I just misunderstood something, or is gcc non-compliant in this case (incidentally Visual Studio 2015 yields the same unexpected result)?

EDIT:

I understand that the order the arguments to your_function are evaluated and assigned to the parameters is undefined. I'm not asking about why my middle term is 0. I'm asking why the other two terms are both 12.

Peter Cordes
Jonathan Mee
  • According to [this](http://stackoverflow.com/questions/4176328/undefined-behavior-and-sequence-points) it is undefined behavior. I also believe `your_function(++i, ++i)` is undefined behavior as well. – NathanOliver Jan 06 '16 at 16:20
  • `your_function(++i, ++i)` is clearly UB. – Eugene Sh. Jan 06 '16 at 16:21
  • The sequence points described by the standard quote are within the body of the `printf()` function after it has been called. You have extreme undefined behaviour in the calling sequence before the `printf()` function is called — which means that any result is acceptable (including the one you got). – Jonathan Leffler Jan 06 '16 at 16:23
  • @lurker: Both increments have to be complete before `your_function()` is called; there is a sequence point after the arguments to a function call have been evaluated, and so the side-effects in the argument list are complete then. What is unclear is whether the same value is passed twice to `your_function()`; that is undefined. It is also undefined whether the other two values of `i` passed to `printf()` are evaluated before, after or during the double increments in the other function call. – Jonathan Leffler Jan 06 '16 at 16:26
  • The simple thing is that the evaluation order of function arguments is *unspecified*. – Eugene Sh. Jan 06 '16 at 16:26
  • @NathanOliver Yeah I got that the function is non-deterministic, I just needed it to demonstrate the problem. – Jonathan Mee Jan 06 '16 at 16:27
  • So what *do* you expect? Output like `12 - 1 - 12`? You can achieve that by wrapping `++i` with `int f() { return ++i; }`. Of course `i` has to be made global for that. – Good Night Nerd Pride Jan 06 '16 at 16:28
  • @JonathanLeffler I see. Let's ignore your_function(). There is no sequence point between evaluating the arguments in printf. The quote I posted deals with action in the printf function. Therefore that quote doesn't make *printf( "" , i++ , i++ )* statements defined. – 2501 Jan 06 '16 at 16:29
  • I have attempted to cover the case [involving `printf()`](http://stackoverflow.com/a/34536741/1275169) in the canonical question/answer page here, which you may find useful. – P.P Jan 06 '16 at 16:35
  • What is the *action associated with input/output conversion format specifier*? To my understanding, it is just the printing for `%d`, so all of the printing actions can be done after all of the inputs have been evaluated. `%n` would be a special case, as it requires counting the characters already printed. So *before* `%n` the preceding format specifiers have to be converted into their corresponding strings – Eugene Sh. Jan 06 '16 at 16:37
  • @2501: That has been the case for some time, the classic example being `printf("foo %n %d\n", &a, 42);` where `%n` behaves as a sequence point – Elias Van Ootegem Jan 06 '16 at 17:58

3 Answers

11

I think you misunderstood the text about the printf sequence points (SP). They are something of an anomaly, and they only matter with %n, because this format specifier has side effects and those side effects need to be sequenced.

Anyway, there is a SP at the beginning of the execution of printf(), after the evaluation of all the arguments. The format-specifier SPs all come after this one, so they do not affect your problem.

In your example, the uses of i are all in function arguments, and none of them are separated with sequence points. Since you modify the value (twice) and use the value without intervening sequence points, your code is UB.

What the rule about the SP in printf means is that this code is well defined:

int x;
printf("%d %n %d %n\n", 1, &x, 2, &x);

even though the value of x is modified twice.

But this code is UB:

int x = 1;
printf("%d %d\n", x, ++x);

NOTE: Remember that %n means that the number of characters written so far is stored in the integer pointed to by the associated argument.
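To make the `%n` case concrete, here is a minimal sketch that uses `snprintf` instead of `printf` so the result can be inspected. The helper name `last_n_wins` is made up for illustration. It relies on the rule in C11 7.21.6 that the formatted I/O functions behave as if there is a sequence point after the actions associated with each specifier, so the second `%n` store comes after the first. Some C libraries restrict or disable `%n` for security reasons, so treat this as an illustration rather than a pattern to copy.

```c
#include <stdio.h>

/* Hypothetical helper: stores into x twice through %n and returns the final value. */
int last_n_wins(char *buf, size_t size)
{
    int x = 0;

    /* Both %n conversions write to x, but each store happens inside the
       library call, one conversion at a time. */
    snprintf(buf, size, "%d %n %d %n", 1, &x, 2, &x);

    return x; /* the count recorded by the second %n */
}
```

With a 16-byte buffer the call leaves `"1  2 "` in `buf` and returns 5, the number of characters written before the second `%n`.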

NathanOliver
rodrigo
  • Mostly correct, but "none of them are separated with sequence points" isn't completely accurate. Specifically, there is a sequence point after the arguments to `your_function()` (and the 'function designator') are evaluated and before the function is called. So, both the increments are complete before that function is called. What you can't say is when the other arguments to `printf()` were evaluated w.r.t the call to `your_function()` — that is undefined, so the values passed to `printf()` are undefined. – Jonathan Leffler Jan 06 '16 at 16:32
  • @JonathanLeffler: Since all the arguments to both `printf()` and `your_function()` can be evaluated without intermediate SP, I'd say that "none of them are separated with SP" is more or less accurate (should I say "sequenced" instead of "separated"?). The SP you point out actually sequences the calls to `your_function()` and `printf()`. – rodrigo Jan 06 '16 at 16:37
  • Sorta. It's messy. There is, emphatically, a sequence point after the arguments to `your_function()` are evaluated and before the function is called. The third argument to `printf()`, therefore, cannot be evaluated until the call to `your_function()` returns; and that call must therefore precede the SP before the call to `printf()`. But otherwise, there are no constraints on the order of evaluation of the arguments to `printf()`. At this stage, these minutiae don't really matter: the overall behaviour is UB because of the two increments to `i`, and anything can happen as a result of UB. – Jonathan Leffler Jan 06 '16 at 16:43
  • @JonathanLeffler: Messy indeed. That's why C++ replaced the "sequence point" concept with "sequenced before/after". But I disagree with "The third argument to printf(), therefore, cannot be evaluated until the call to your_function() returns". A valid order of evaluation of `f1(a, f2(b, c), d)` could be `d`, `b`, `c`, `f2()`, `a`, `f1()`. The only sequenced evaluations are `b`, `c` before `f2()` and `f2()`, `a`, `d` before `f1()`. – rodrigo Jan 06 '16 at 16:50
  • Your "I disagree" comment then goes on to state what I think I stated (noting that the result of `your_function()` _is_ the third argument to `printf()`), so … let's leave it at 'messy' and close the discussion. I note that the statement quoted is actually from Wikipedia, not from a standard. I went trying to find it in the standard, and can't find similar text. That's good, because I can't work out what the significance of the 'internal sequence points' claim would be — when the presence or absence of a sequence point between conversion specifications would be detectable. – Jonathan Leffler Jan 06 '16 at 16:54
  • @JonathanLeffler: The SP between conversions specifications would be detectable when those conversions have side effects, such as `%n` in my example. I didn't find this quote in the standard either, but I found another somewhat similar rule related to the callbacks from `qsort()` and `bsearch`. – rodrigo Jan 06 '16 at 16:57
  • I'd like to discuss this more but I also need to get to the office. Is email an option? See my profile. – Jonathan Leffler Jan 06 '16 at 17:08
  • @rodrigo `%n` has a side effect, yes, but how are you to observe the state of the program between them? – Random832 Jan 06 '16 at 22:05
8

Since this question grew out of a comment-based discussion here, I'll provide some context:

first comment: The order of operations is not guaranteed to be the order in which you pass arguments to the function. Some people (wrongly) assume that the arguments will be evaluated right to left, but according to the standard, the behaviour is undefined.

The OP accepts and understands this. No point in repeating the fact that your_function(++i, ++i) is UB.

In response to that comment: Thanks to your comment I see that printf may be evaluated in any order, but I understood that to be because printf arguments are part of a va_list. Are you saying that the arguments to any function are executed in an arbitrary order?

OP asking for clarification, so I elaborated a bit:

Second comment: Yes, that's exactly what I'm saying. even calling int your_function(int a, int b) { return a - b; } does not guarantee that the expressions you pass will be evaluated left to right. There's no sequence point (a point at which all side effects of previous evaluations are performed). Take this example. The nested call is a sequence point, so the outer call passes i+1 (13), and the return value of the inner call (undefined, in this case -1 because i++, i evaluates to 12, 13 apparently), but there's no guarantee that this will always be the case

That made it pretty clear that these kinds of constructs trigger UB for all functions.


Wikipedia confusion

OP Quotes this:

After the action associated with input/output conversion format specifier. For example, in the expression printf("foo %n %d", &a, 42), there is a sequence point after the %n is evaluated before printing 42.

Then applies it to his snippet (printf("%d - %d - %d\n", i, your_function(++i, ++i), i);) expecting the format specifiers to serve as sequence points.
What is being referred to by saying "input/output conversion format specifier" is the %n specifier. The corresponding argument must be a pointer to a signed integer (int *), and it will be assigned the number of characters printed thus far. Naturally, %n must be processed before the rest of the output is printed. However, using the variable passed for %n in other arguments is still dangerous. It is not necessarily UB, but it can be:

printf("Foo %n %*s\n", &a, 100-a, "Bar");//DANGER!!

There is a sequence point before the function is called, so the expression 100-a will be evaluated before %n has stored the count in a. If a is uninitialized, then 100-a is UB. If a is initialized to 0, for example, the result of the expression will be 100. On the whole, though, this kind of code is pretty much asking for trouble. Treat it as very bad practice, or worse...
Just look at the output generated by these statements:

int a = 90;
printf("%d %n %*s\n",a,  &a, 10, "Bar");//90         Bar
printf("%d\n", a);//3
printf("Foo %d %n %*s\n",a, &a, 10-a, "Bar");//Foo 3      Bar < padding used: 10 - 3, not 10 - 6
printf("%d\n", a);//6

As you can see, a gets reassigned inside of printf, so you can't use its new value in the argument list (all arguments are evaluated before the sequence point that precedes the call). If you expect a to be reassigned "in-place" you're essentially expecting C to jump out of the function call, evaluate other arguments, and jump back into the call. That's just not possible. If you were to change int a = 90; to int a;, then the behaviour is undefined.
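If the goal is to pad according to what `%n` recorded, the count has to be read in a later statement, after the call that stored it has returned. The following is a minimal sketch of that idea, assuming an `int` counter and a made-up helper name `pad_after_prefix`; it uses `snprintf` so the result can be checked, and it is not the only way to do this (the return value of `snprintf` would serve equally well here).

```c
#include <stdio.h>

/* Hypothetical helper: writes "Foo ", then right-aligns "Bar" so that the
   whole line is total_width characters wide. Returns the prefix length
   recorded by %n, or -1 if the buffer is too small. */
int pad_after_prefix(char *buf, size_t size, int total_width)
{
    int prefix_len = 0;

    /* First call: %n stores the number of characters written so far. */
    int n = snprintf(buf, size, "Foo %n", &prefix_len);
    if (n < 0 || (size_t)n >= size)
        return -1;

    /* The first call has returned, so prefix_len holds its final value
       and can safely be used to compute the field width. */
    int width = total_width - prefix_len;
    if (width < 0)
        width = 0;

    int m = snprintf(buf + n, size - (size_t)n, "%*s", width, "Bar");
    if (m < 0 || (size_t)m >= size - (size_t)n)
        return -1;

    return prefix_len;
}
```

Called with a 32-byte buffer and `total_width` of 10, this returns 4 and leaves `"Foo    Bar"` in the buffer.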


Concerning the 12's

Now because the OP read up on sequence points, he correctly notices that this statement:

printf("%d - %d - %d\n", i, your_function(++i, ++i), i);

Is slightly different: your_function(++i, ++i) is a sequence point, and guarantees that i will be incremented twice. This function call is a sequence point because:

Before a function is entered in a function call. The order in which the arguments are evaluated is not specified, but this sequence point means that all of their side effects are complete before the function is entered

That means that, before printf is called, your_function has to be called (because its return value is one of the arguments for the printf call), and i will be incremented twice.
This could explain the output being "12 - 0 - 12", but is it guaranteed to be the output?

No

Technically, although most compilers will evaluate the your_function(++i, ++i); call first, the standard would allow a compiler to evaluate the arguments passed to printf left to right (the order isn't specified after all). So these would be equally valid results:

10 - 0 - 12
//or even
12 - 0 - 10
//and
10 - 0 - 10
//technically, even this would be valid
12 - 0 - 11

Although the latter output is extremely unlikely (it'd be very inefficient)
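One way to get a single guaranteed result is to give every read and every increment of i its own statement, so that each is separated from the next by a sequence point. The sketch below does that; the helper name `format_sequenced` and the left-to-right order it picks are assumptions made for illustration, not something taken from the question.

```c
#include <stdio.h>

int your_function(int a, int b)
{
    return a - b;
}

/* Hypothetical helper: produces the three values with each increment of i
   in its own declaration, then formats them into buf. */
int format_sequenced(char *buf, size_t size, int i)
{
    int first = i;                    /* value of i before any increment */
    int a = ++i;                      /* first increment, finished here */
    int b = ++i;                      /* second increment, finished here */
    int middle = your_function(a, b); /* no side effects in the arguments */

    /* Every argument is now a plain read, so evaluation order is irrelevant. */
    return snprintf(buf, size, "%d - %d - %d", first, middle, i);
}
```

Starting from i = 10, this writes `10 - -1 - 12` into the buffer regardless of the compiler.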

Elias Van Ootegem
4

Arriving at a clear answer to this question is strongly affected (even prevented) by the C rules on order of evaluation and UB.

The specified rules on order of evaluation are stated here:

C11 section 6.7.9, p23: The evaluations of the initialization list expressions are indeterminately sequenced with respect to one another and thus the order in which any side effects occur is unspecified.

And, this function call will exhibit undefined behavior:

your_function(++i, ++i)

Because of UB, coupled with the rules on order of evaluation, accurate predictions on the expected outcomes for the following:

printf("%d - %d - %d\n", i, your_function(++i, ++i), i);

are impossible.

Edit
...I'm not asking about why my middle term is 0. I'm asking why the other two terms are both 12.

There is no guarantee which of the three arguments of the above function is evaluated first (because of C's rules on order of evaluation). And if the middle argument gets evaluated first, then at that point you have invoked undefined behavior. Who can really say why the other two terms are 12? What happens to i when the second argument is evaluated is anyone's guess.

ryyker
  • I've edited the question to provide clarification: "I'm not asking about why my middle term is 0. I'm asking why the other two terms are both 12." – Jonathan Mee Jan 06 '16 at 16:43
  • @JonathanMee - Because of the rules on order of evaluation, the middle argument could be evaluated first, resulting in UB for at least itself ( `your_function()` ), and possibly having a downstream effect on subsequent results. So could it not be possible that UB _bleeds through_ to the entire function: `printf("%d - %d - %d\n", i, your_function(++i, ++i), i);` ? – ryyker Jan 06 '16 at 16:46
  • Your statement "literally anything can happen" is false. There *is* a sequence point on the return of a function, so after `your_function` is called, `i` *will* be 12. This question was about whether the statement on http://en.wikipedia.org meant that I could depend upon `printf` arguments being called in order. – Jonathan Mee Jan 06 '16 at 17:09
  • @JonathanMee - That statement is a tongue in cheek reference to the link it follows, where nasal demons are used to describe UB. I will change it if it is a problem for you however. ***However***, I do not agree with your statement: _so after your_function is called, i will be 12_. That function is the _cause_ of UB. There is no guarantee what `i` will be. _[Elias](http://stackoverflow.com/a/34638527/645128)_ explains it well in his answer. (read toward the bottom) – ryyker Jan 06 '16 at 17:27
  • `your_function` is not the cause of UB, even just `printf("%d%d", i, ++i);` exhibits undefined behavior as per §1.9/15 (N3690). The cause of the UB is having two side-effects or a side effect and a value computation that are unsequenced wrt one another on a scalar object. -- I should clarify that in my example, while the value computations of function arguments are merely indeterminately sequenced, the side effect of `++i` is unsequenced wrt to the value computation of `i`. – Arne Vogel Jan 06 '16 at 19:03
  • @ArneVogel - No, agreed. The function `your_function` itself is not the _cause_ of UB. It is the arguments (two instances of the same object) with increments that cause it. From C11: ***6.5 Expressions*** _2. If a side effect on a scalar object is unsequenced relative to either a different side effect on the same scalar object or a value computation using the value of the same scalar object, the behavior is undefined...._. The increments to `i` are side effects, and because they are being applied to the same scalar object, `i`, UB is invoked. – ryyker Jan 06 '16 at 20:44