2

Very simple C code:

#include <stdio.h>

int main() {
    int arr[] = {1, 2, 3, 4, 5};
    int *ptr = arr;
    printf("%d, %d\n", *ptr, *(++ptr));

    return 0;
}

Compiled with gcc 4.8.2, the result is:

2, 2

Compiled with clang 3.4, the result is:

1, 2

Why does this happen?

helsinki

3 Answers

7

The comma used when calling a function does not come with a sequence point.

Therefore, the code *ptr, *(++ptr) invokes undefined behavior: ptr is both modified and read between the same two sequence points, and the read is for a purpose other than determining the value to store in ptr.

This is defined by C11 6.5/2, in the following gibberish text:

If a side effect on a scalar object is unsequenced relative to either a different side effect on the same scalar object or a value computation using the value of the same scalar object, the behavior is undefined. If there are multiple allowable orderings of the subexpressions of an expression, the behavior is undefined if such an unsequenced side effect occurs in any of the orderings.

In our case, the side effect (changing ptr with ++ in *(++ptr)) is unsequenced in relation to the value computation of the same object (*ptr).

And since it is undefined behavior, anything can happen. Since your program does "something", it behaves as expected (or rather, as "unexpected").

In addition, the order of evaluation of function arguments is unspecified, so you cannot know whether the first or second argument in the function call gets evaluated first. The order can differ not only between compilers, but between source code lines in the same program. The compiler can evaluate them in any order it likes and it does not need to document how.

Lundin
  • +1 for calling the C standard 'gibberish text'. It's true. – mic_e Apr 01 '14 at 12:57
  • @mic_e That text was perfectly fine in C99: `"Between the previous and next sequence point an object shall have its stored value modified at most once by the evaluation of an expression. Furthermore, the prior value shall be read only to determine the value to be stored".` In C11 they rewrote it to the above, unreadable crap. – Lundin Apr 01 '14 at 13:00
1

Try this:

#include <stdio.h>

int main() {
  int arr[] = {1, 2, 3, 4, 5};
  int *ptr = arr;
  printf("%d, %d\n", ptr[0], ptr[1]);
  /* Or this
     printf("%d, %d\n", ptr[1], ptr[0]);
     whichever you meant
  */
  return 0;
}

No order is defined for the evaluation of function arguments; compilers can choose whatever order they like. A compiler can evaluate *(++ptr) first, followed by *ptr, which would result in the arguments 2, 2 being passed to printf, or the other way around, which results in 1, 2 being passed to printf.

Gábor Buella
  • And clang actually has a warning `-Wunsequenced` which is part of `-Wall`. The zeroth mistake is developing without even basic warnings. – Benjamin Bannier Apr 01 '14 at 12:31
  • This doesn't fully answer the question, since the result does not only rely on the order of evaluation, but also invokes undefined behavior. – Lundin Apr 01 '14 at 12:47
1

The evaluation order in the printf line is unspecified, which alone is enough to allow both interpretations.

In addition, there are no sequence points between the two pointer accesses. This causes undefined behaviour (read: your program might crash, work as you expect, or upload your hard drive contents to the internet, depending on the current day of the week and which compiler you're using).

If you compile with warnings, you'll notice the following:

mic@mic-nb $ gcc test.c -std=c11 -Wall -Wextra -pedantic
test.c: In function ‘main’:
test.c:6:29: warning: operation on ‘ptr’ may be undefined [-Wsequence-point]
  printf("%d, %d\n", *ptr, *(++ptr));
                             ^
mic@mic-nb $ clang test.c -std=c11 -Wall -Wextra -pedantic
test.c:6:29: warning: unsequenced modification and access to 'ptr' [-Wunsequenced]
        printf("%d, %d\n", *ptr, *(++ptr));
                            ~~~    ^
1 warning generated.

Life tip: Always compile with -Wall -Wextra -pedantic, always fix all warnings, always test with both clang and gcc, and you'll have far fewer errors. I even add -Werror to my release build configs.

mic_e
  • As clang tells you, there is an unsequenced modification of ptr. Which in turn means that the program invokes undefined behavior. So the problem is not just the order of evaluation. – Lundin Apr 01 '14 at 12:49
  • Note to self: Do a clang fork that will implement the kind of undefined behaviour I'm describing in my post. – mic_e Apr 01 '14 at 12:59
  • +1 for the compiler fork that actually *does* something truly evil upon UB. *That* will finally teach them. Although, since not everybody has a fast upload, I'd go for something like nuking the hard drive's partition table, then attempting to crash the HD heads. ;-) – DevSolar Apr 01 '14 at 13:06
  • The edit isn't correct, because there are two separate issues here, which you are getting mixed up: accessing a variable twice between sequence points, which is _undefined_ behavior (anything can happen). And the order of evaluation of function parameters, which is _unspecified_ behavior (the compiler can do as it pleases in a reliable, deterministic, yet completely undocumented manner). So relying on unspecified behavior will not upload your harddrive to the internet. – Lundin Apr 01 '14 at 13:06
  • Try to compile the same code as `(0,*ptr), (0,*(++ptr))` to add some sequence points, then the warnings should go away. But the order of evaluation remains unspecified and you may still get different results on the same compiler, or on different compilers. – Lundin Apr 01 '14 at 13:09
  • @Lundin: Thanks, I hope I got it right now. If not, feel free to edit. – mic_e Apr 01 '14 at 13:27
  • @Lundin Unspecified behavior does not have to be reliable or deterministic. A hypothetical just-in-time compiler where the order of evaluation of `f()` and `g()` in `h(f(), g())` changed during execution would be compliant. http://shape-of-code.coding-guidelines.com/2011/06/18/fibonacci-and-jit-compilers/ – Pascal Cuoq Apr 06 '14 at 13:00
  • @PascalCuoq It is reliable and deterministic from the compiler's point of view. It may indeed change between left-to-right or right-to-left evaluation as it pleases, but it makes this decision based on internal optimization rules. So while the programmer cannot know or predict the result, he can at least be sure that the program will not crash and burn just because it invokes unspecified behavior. Though of course, if the programmer decides to rely on a certain outcome from unspecified behavior, that might be just as bad as invoking undefined behavior. – Lundin Apr 07 '14 at 07:46