
I know that expressions like x = x++ + ++x invoke undefined behavior because a variable is modified more than once without an intervening sequence point. That's thoroughly explained in the post Why are these constructs using pre and post-increment undefined behavior?

But consider a thing like printf("foo") + printf("bar"). The function printf returns an int, so the expression is valid in that sense. But the order of evaluation for the + operator is not specified in the standard, so it is not clear if this will print foobar or barfoo.

My question is whether this is also undefined behavior.

klutt
  • `printf + printf` is fine. The `+` provides a sequence point so all functions are resolved before the number is added. – David C. Rankin Aug 30 '20 at 10:07
  • @DavidC.Rankin so which one is resolved first? – mangusta Aug 30 '20 at 10:15
  • @DavidC.Rankin: `+` does not provide a sequence point. – Eric Postpischil Aug 30 '20 at 10:54
  • @mangusta - it matters not which is evaluated first, only that they are both evaluated before the `+` is applied to the operands. [C11 Standard - 6.5.6 Additive operators(p5)](http://port70.net/~nsz/c/c11/n1570.html#6.5.6p5) – David C. Rankin Aug 30 '20 at 11:44
  • @EricPostpischil [C11 Standard - 6.5 Expressions(p1)](http://port70.net/~nsz/c/c11/n1570.html#6.5) "*The value computations of the operands of an operator are sequenced before the value computation of the result of the operator.*" says just what I wrote, that computations of the operands (`printf()`) are sequenced before the addition is applied. – David C. Rankin Aug 30 '20 at 11:56
  • @DavidC.Rankin: (a) That is not a sequence point. Specifying ordering for value computations does not affect side effects, which a sequence point does. (b) Even if that did specify a sequence point, it is after the evaluations of the operands, so it would have no effect on the sequence of the operands relative to each other and would be irrelevant to the question asked here about two `printf` calls in this expression. – Eric Postpischil Aug 30 '20 at 17:57
  • I get that. The question was whether the addition is UB, and the answer is no because both `printf()` calls will be sequenced before the addition. Now to the other issue, which `printf()` prints first, that wasn't what I was addressing. You are 100% correct that nothing specifies which `printf()` is sequenced first. – David C. Rankin Aug 31 '20 at 00:56
  • @DavidC.Rankin : Answers go in Answers, not Comments. If you are placing (even partial) Answers in Comments to the Question, you are doing it wrong. Comments are to improve the Question, which your comment chain fails to do. – Eric Towers Aug 31 '20 at 01:57
  • @DavidC.Rankin: The fact that the value computations of both `printf` are sequenced before the addition is not a reason that the behavior is not undefined. In `x++ + x++`, the value computations of both operands are sequenced before the addition, but the behavior is undefined. – Eric Postpischil Aug 31 '20 at 11:16
  • @EricPostpischil - that example does not apply to `printf()` as the side effect (the text output of `printf`) does not affect the value that may be returned by the other `printf`. The behavior with `x++ + x++` is undefined because the indeterminate sequencing can affect the result of the operator. This whole diatribe started based on my use of "sequence point" rather than just using "sequenced" -- what is happening is understood. We are saying the same thing. – David C. Rankin Aug 31 '20 at 19:16
  • @DavidC.Rankin: The behavior with `x++` is undefined because 6.5 2 says behavior is undefined if two side effects on an object are unsequenced. Sequencing the value computations does not cure this. The situation with two `printf` calls is the same because the `FILE` object necessarily contains scalar objects governed by 6.5 2, and `printf` causes side effects on them. No, we are not saying the same thing. The rule about sequencing of value computations does not prevent undefined behavior. – Eric Postpischil Aug 31 '20 at 19:24
  • @EricPostpischil So, just to be clear, you are saying `printf + printf` is undefined behavior and the result of the addition is therefore undefined, correct? – David C. Rankin Aug 31 '20 at 19:49
  • @DavidC.Rankin: No. I am saying the rule about sequencing of value computations is irrelevant; it is **not** a reason that the behavior is not undefined. That is not an assertion regarding whether `printf("foo") + printf("bar")` is undefined or not. – Eric Postpischil Aug 31 '20 at 20:43

4 Answers


printf("foo") + printf("bar") does not have undefined behavior (except for the caveat noted below) because the function calls are indeterminately sequenced and are not unsequenced.

C effectively has three possibilities for sequencing:

  • Two things, A and B, may be sequenced in a particular order: either A before B or B before A.
  • Two things may be indeterminately sequenced, so that A is sequenced before B or vice-versa, but it is unspecified which.
  • Two things are unsequenced.

To distinguish between the latter two, suppose writing to stdout requires putting bytes in a buffer and updating the counter of how many bytes are in the buffer. (For this, we will neglect what happens when the buffer is full or should be sent to the output device.) Consider two writes to stdout, called A and B.

If A and B are indeterminately sequenced, then either one can go first, but both of its parts—writing the bytes and updating the counter—must be completed before the other one starts. If A and B are unsequenced, then nothing controls the parts; we might have: A puts its bytes in the buffer, B puts its bytes in the buffer, A updates the counter, B updates the counter.

In the former case, both writes are completed, but they can be completed in either order. In the latter case, the behavior is undefined. One of the possibilities is that B writes its bytes in the same place in the buffer as A’s bytes, losing A's bytes, because the counter was not updated to tell B where its new bytes should go.
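As a rough sketch of that distinction, the buffer and counter can be modeled directly. Everything here is hypothetical (`toy_buf`, `toy_put`, `toy_advance` and the rest are names made up for illustration), and it is not how any real stdio implementation works. Note also that truly unsequenced accesses are undefined behavior, so the interleaved function below only spells out one of the outcomes described above, using ordinary sequenced statements.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical toy model of an output buffer; not a real stdio layout. */
enum { TOY_CAP = 16 };

struct toy_buf {
    char   bytes[TOY_CAP];
    size_t count;               /* number of valid bytes in `bytes` */
};

/* Part 1 of a write: store n bytes where the counter currently points. */
static void toy_put(struct toy_buf *b, const char *s, size_t n)
{
    assert(b->count + n <= TOY_CAP);
    memcpy(b->bytes + b->count, s, n);
}

/* Part 2 of a write: advance the counter past the bytes just stored. */
static void toy_advance(struct toy_buf *b, size_t n)
{
    assert(b->count + n <= TOY_CAP);
    b->count += n;
}

/* Indeterminately sequenced: either write may go first, but each one
   completes both of its parts before the other starts. */
static void toy_sequenced(struct toy_buf *b, int a_first)
{
    const char *first  = a_first ? "foo" : "bar";
    const char *second = a_first ? "bar" : "foo";
    toy_put(b, first, 3);
    toy_advance(b, 3);
    toy_put(b, second, 3);
    toy_advance(b, 3);
}

/* One possible interleaving: A puts, B puts, A advances, B advances. */
static void toy_interleaved(struct toy_buf *b)
{
    toy_put(b, "foo", 3);
    toy_put(b, "bar", 3);       /* lands on top of "foo": counter is still 0 */
    toy_advance(b, 3);
    toy_advance(b, 3);
}
```

Starting from a zeroed `struct toy_buf`, `toy_sequenced` leaves either "foobar" or "barfoo" in the buffer, while `toy_interleaved` leaves a counter of 6 with A's bytes lost.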

In printf("foo") + printf("bar"), the writes to stdout are indeterminately sequenced. This is because the function calls provide sequence points that separate the side effects, but we do not know in which order they are evaluated.

C 2018 6.5.2.2 10 tells us that function calls introduce sequence points:

There is a sequence point after the evaluations of the function designator and the actual arguments but before the actual call. Every evaluation in the calling function (including other function calls) that is not otherwise specifically sequenced before or after the execution of the body of the called function is indeterminately sequenced with respect to the execution of the called function.

Thus, if the C implementation happens to evaluate printf("foo") second, there is a sequence point just before the actual call, and the evaluation of printf("bar") must have been sequenced before this. Conversely, if the implementation evaluates printf("bar") second, then printf("foo") must have been sequenced before it. So, there is sequencing, albeit indeterminate.

Additionally, 7.1.4 3 tells us:

There is a sequence point immediately before a library function returns.

Therefore, the two function calls are indeterminately sequenced. The rule in 6.5 2 about unsequenced side effects does not apply:

If a side effect on a scalar object is unsequenced relative to either a different side effect on the same scalar object or a value computation using the value of the same scalar object, the behavior is undefined…

(Not to mention the fact that stdout is not a scalar object.)
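A minimal sketch of the expression in question, wrapped in a function so the result can be inspected. The name `print_both` is made up for illustration. Assuming both calls succeed, each `printf` returns the number of characters written (3), so the sum is 6 whichever string reaches stdout first.

```c
#include <stdio.h>

/* Writes "foobar" or "barfoo" to stdout; which one is unspecified.
   The two calls are indeterminately sequenced, so they cannot overlap. */
int print_both(void)
{
    return printf("foo") + printf("bar");
}
```

Only the order of the output varies between conforming implementations; the returned sum does not depend on it.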

Caveat

There is a hazard that the C standard permits standard library functions to be implemented as function-like macros (C 2018 7.1.4 1). In this case, the reasoning above about sequence points might not apply. A program can force function calls by enclosing the name in parentheses so that it will not be treated as an invocation of a function-like macro: (printf)("foo") + (printf)("bar").
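The parenthesized form might be used like this. It is only a sketch, and `print_both_no_macro` is a hypothetical name. A function-like macro is expanded only when its name is immediately followed by `(`, so `(printf)` always designates the actual function.

```c
#include <stdio.h>

/* Same expression as in the question, but each call is guaranteed to be
   a real function call even if <stdio.h> also defines a printf macro. */
int print_both_no_macro(void)
{
    return (printf)("foo") + (printf)("bar");
}
```

The standard also permits `#undef printf` after including the header to the same effect, though the parentheses keep the change local to one expression.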

Eric Postpischil
  • This seems like a nice answer, but to be honest, I'm not sure if you're claiming that it's undefined behavior or not. – klutt Aug 30 '20 at 11:46
  • You seem to be saying that we have defined behavior, but we don't know if we will have `foobar` or `barfoo`. Is that not undefined behavior? – Teepeemm Aug 30 '20 at 21:49
  • @Teepeemm: “Undefined behavior” is a specific term in the C standard that means the standard imposes no requirements in a particular situation. If either X or Y can happen, but one of them must happen, then that is not undefined behavior, because the standard is imposing some requirements even though it is not fully specifying what must happen. This is “unspecified behavior,” in which the standard allows two or more possibilities and does not impose requirements on which is chosen in any instance. – Eric Postpischil Aug 30 '20 at 22:33
  • @Teepeemm see https://stackoverflow.com/questions/2397984/undefined-unspecified-and-implementation-defined-behavior – M.M Aug 30 '20 at 23:44
  • Re: "the standard allows two or more possibilities": if the Standard does not define the order in which the side effects take place, then does it mean that [`i = x[i]++;`](https://stackoverflow.com/q/71843405/1778275) leads to unspecified behavior, rather than to undefined behavior? – pmor Apr 25 '22 at 21:22
  • @pmor: If `x[i]` and `i` are not both volatile and `x[i]` does not overlap `i`, then the observable behavior of `i = x[i]++;` is identical whether `x[i]` is incremented (due to `++`) first or `i` is updated (due to `=`) first. So there is no unspecified behavior as far as the C standard is concerned. The details about how these side effects occur are unspecified, of course, but they always are. If `i` and `x[i]` overlap, the behavior is undefined, due to the rule about multiple unsequenced modifications. Otherwise, if they are volatile, the update order is unspecified. – Eric Postpischil Apr 25 '22 at 21:30
  • @EricPostpischil OK for `x[i]` and `i` are not both volatile and `x[i]` does not overlap `i`. Per C11, 3.4.4: unspecified behavior is "... other behavior where this International Standard provides two or more possibilities and imposes no further requirements on which is chosen in any instance". The order of side effects is exactly "two or more possibilities ...". Can you provide your answer/view to the [question](https://stackoverflow.com/questions/71843405/does-i-xi-lead-to-undefined-behavior)? – pmor May 05 '22 at 22:28

No it is not.

It is Unspecified Behaviour

[image]

0___________
  • The order is for sure unspecified, but I wonder if the fact of writing to stdout doesn't fall into the "undefined" category, due to `2) If a side effect on a scalar object is unsequenced relative to a value computation using the value of the same scalar object, the behavior is undefined. ` https://en.cppreference.com/w/cpp/language/eval_order To be clear: I'm not sure, hence I'm asking. – alagner Aug 30 '20 at 10:27
  • @alagner: The writes to `stdout` are indeterminately sequenced (they must occur in some order, with the writes separated from each other, although the order is not specified), not unsequenced (the writes are not prevented from overlapping, parts of them may occur out of order relative to parts of others). – Eric Postpischil Aug 30 '20 at 11:24
  • I think this answer would benefit a lot if you changed the picture for text and a link – klutt Aug 30 '20 at 15:02
  • @alagner: If one had two `printf(...)` or `fprintf(stdout,...)` calls, or two `fprintf(stderr,...)` calls, they would have to be performed in one or the other order. If one had one output to `stdout` and one to `stderr`, the outputs could be interleaved, but that could happen even if the calls were strongly sequenced. – supercat Aug 30 '20 at 15:05
  • `printf` doesn't modify the value `stdout`, the function body might modify some static object, but the sequencing rules for function calls mean this isn't a problem – M.M Aug 30 '20 at 23:49
  • @alagner : I'm not following. Which **scalar object** is being modified? `stdout` is very much not a scalar object. – Eric Towers Aug 31 '20 at 02:02

You're probably asking because a program that tries to read an unspecified value (e.g. uninitialised int) has undefined behaviour.

That is not the case with unspecified order or indeterminately sequenced operations. You don't know what you'll get, but the program has well-defined behaviour.
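A small sketch of "you don't know what you'll get, but the behaviour is well defined". All names here (`order_log`, `mark`, `order_is_permitted`) are made up for illustration. Each call records a tag, and because function calls are indeterminately sequenced, the log can only end up as one of two values.

```c
#include <string.h>

static char order_log[3];       /* call order, recorded as "AB" or "BA" */
static int  order_len;

/* Appends a tag to the log; the modification happens inside a function
   call, so it cannot overlap with the other call. */
static int mark(char tag)
{
    order_log[order_len++] = tag;
    return 1;
}

/* Returns 1 if the outcome is one of the two orders the standard permits. */
int order_is_permitted(void)
{
    order_len = 0;
    memset(order_log, 0, sizeof order_log);

    int sum = mark('A') + mark('B');    /* which call runs first is unspecified */

    return sum == 2 &&
           (strcmp(order_log, "AB") == 0 || strcmp(order_log, "BA") == 0);
}
```

A conforming implementation may pick either order, but it has to pick one of them, which is what separates unspecified from undefined here.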

The writing to stdout doesn't cause a problem because the value is not "unspecified" in that sense either. You can think of it more as an implementation-defined value, as a result of the unspecified ordering.

tl;dr: not everything "unspecified" leads to being "undefined".

Asteroids With Wings

As noted elsewhere, if two function calls are used in an expression, a compiler may choose in Unspecified fashion which one will be invoked first, but all parts of one operation (chosen in Unspecified fashion) must precede all parts of the other. By contrast, if two operations are unsequenced, a compiler may interleave the parts of the operation.

A point I haven't seen mentioned, however, is that while many compilers are designed to process certain operations on primitive types in such a way that distinctions between "unsequenced" and "indeterminately sequenced" don't matter, some optimizers may produce machine code where such things could matter, especially in multi-threaded scenarios, so it's good to be concerned about such distinctions.

Consider a function like the following, if processed by gcc 9.2.1 with options -xc -O3 -mcpu=cortex-m0 [the Cortex-M0 is a popular current-production 32-bit core found in low-end microcontrollers]:

#include <stdint.h>
uint16_t incIfUnder32768(uint16_t *p)
{
    uint16_t temp = *p;
    return temp - (temp >> 15) + 1;
}

One might expect that if another thread were to change *p during the function, it would either perform the computation based upon the value of *p before the change, or perform the computation based upon the value after. The optimizer for gcc 9.2.1, however, will generate machine code as though the source code were written:

#include <stdint.h>
uint16_t incIfUnder32768(uint16_t *p)
{
    return *p - (*p >> 15) + 1;
}

If the value of *p were to change, e.g., from 0xFFFF to 0, or from 0 to 0xFFFF, between the two reads, the function might return 0 even though there is no value *p could have held that would yield that result.
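To check the arithmetic, the two reads can be modeled as separate parameters. This is only a sketch: `from_one_read` and `from_two_reads` are hypothetical helpers, and whether a change actually lands between the two loads depends on the generated code and on timing. For the function as written above, a single consistent read never produces 0, whereas mixed reads of 0xFFFF and 0 do, in either order.

```c
#include <stdint.h>

/* One snapshot of *p, as the source code with `temp` expresses it. */
uint16_t from_one_read(uint16_t v)
{
    return (uint16_t)(v - (v >> 15) + 1);
}

/* Models machine code that loads *p twice, where the first and second
   loads observe different values. */
uint16_t from_two_reads(uint16_t first, uint16_t second)
{
    return (uint16_t)(first - (second >> 15) + 1);
}
```

With one snapshot, 0 maps to 1 and 0xFFFF maps to 0xFFFF. With `from_two_reads(0xFFFF, 0)` the intermediate result 0x10000 truncates to 0, and `from_two_reads(0, 0xFFFF)` gives 0 - 1 + 1, which is also 0.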

Although compilers when the Standard was written would almost invariably extend the semantics of the language by processing many actions "in a documented fashion characteristic of the environment" regardless of whether the Standard would require them to do so, some "clever" compiler writers seek to exploit opportunities where deviating from such behaviors would allow "optimizations" that might or might not actually make code more efficient.

supercat