1

I understand from several post like this What are all the common undefined behaviours that a C++ programmer should know about?, that the behaviour is undefined and left to compiler.

But for given code, the output is 2 1 3 (ALWAYS, CONSISTENTLY)

#include<stdio.h>
int main(){
    int i = 1;
    printf("%d %d %d\n", i++, i++, i);
    return 0;
}

So, I would like to know, what order is gcc following, doesn't look like left-to-right or right-to-left? The output is same always, and this question is pertaining to just gcc 4.9.1 compiler.

Community
  • 1
  • 1
mtk
  • 13,221
  • 16
  • 72
  • 112
  • 2
    If you want to know what gcc did for this particular run, look at the assembly output (use `-S` switch). Bear in mind that this may change at any time, especially if you change compiler switches or use a different compiler or make a change anywhere else in the program. – M.M Aug 12 '15 at 02:52
  • 2
    "Always, consistently"? Really? Even if we restrict ourselves to a single specific build of the compiler, are you sure you tested all possible combinations of compiler settings? What about all possible variants of surrounding contexts? And if you didn't, how can you make that "always, consistently" claim? – AnT stands with Russia Aug 12 '15 at 03:11
  • 1
    In any case, what makes you think that it is *deliberately following* some order as opposed to producing the same order unintentionally, out of raw natural primeval determinism of computer programs? – AnT stands with Russia Aug 12 '15 at 03:14
  • @Ant no i didn't test with different settings. I mean by 'always, consistently' is that i thought for different runs the output should be different.. and i didn't see such behaviour – mtk Aug 12 '15 at 03:27
  • @mtk: You mean for different runs of the same, already compiled executable? If so, then no, it is a rather unrealistic thing to expect the behavior to change. It is possible to produce undefined behavior that might behave differently from one run of the same executable to another, but what you have above is not it. – AnT stands with Russia Aug 12 '15 at 03:32
  • And in that case I don't understand your reference to "what order is gcc following". A compiled executable does not have any attachment to gcc anymore. – AnT stands with Russia Aug 12 '15 at 03:32
  • 1
    I played around with this a bit. I'll call "A" "`i++`", and "B" "`i`". So `printf("%d %d %d\n", i++, i++, i)` is "AAB". Okay? `AAAB` is `3214`. `AAAAB` is 43215. `AAABAB` is 432515. `AAAABAABAB` is `7654321818`. The rule **seems** to be to perform all operations right-to-left, performing substitution at the same time, and then replace any variables. I don't have any ambitions to find out for certain. – Jeremy Aug 12 '15 at 03:49
  • [Undefined behavior](https://en.wikipedia.org/wiki/Undefined_behavior) doesn't mean that the behavior must change every run – phuclv Aug 12 '15 at 04:38
  • 1
    I'm voting to close this question as off-topic because it asks for UB to be explained for one version of gcc only. I see no lasting value in this for SO visitors. If someone wants UB investigated, they should do it themselves. – Martin James Aug 12 '15 at 09:31

4 Answers4

3

you specified gcc so i will state something very gcc specific.

this is undefined but if you consider how gcc implements call stack, you will know why it's consistent. basically, stack grows from high address to low address, and function like printf which has ... in its argument list will dump the arguments into the stack. given the direction of stack, there aren't too many choices:

func(a1, a2, a3, a4);

push from left to right:

high| ... |
    | a1  |
    | a2  |
    | a3  |
low | a4  |

or right to left:

high| ... |
    | a4  |
    | a3  |
    | a2  |
low | a1  |

the truth is, gcc takes the second approach. it may sound counter-intuitive, but given such stack construct, this is very nicely designed. if you have to access the arguments in sequence, you naturally form an array in this way, so each of arr[0], arr[1], etc will correspond to a correct argument in position.

so after knowing this, the evaluation order isn't very hard to comprehend any more, since gcc handles arguments from right to left any ways. if i need to implement a compiler handles function calls, why would i choose to handle arguments from left to right in this case and right to left in that case? that's more than confusing.

gcc is a very comprehensive compiler if you know some of its implementation. a technical decision often will drive to a very obvious consequence.

this is very gcc specific so things could change easily if somebody changes the growth direction of stack or just feel like creating special case. for instance, clang seems to behave the other way around.

Jason Hu
  • 6,239
  • 1
  • 20
  • 41
  • I like this answer, but is this apparent in the documentation anywhere else? – CinchBlue Aug 12 '15 at 02:48
  • @VermillionAzure i can't dig into the forest of documentation now but i can clearly inform that what [CSAPP](http://www.amazon.ca/Computer-Systems-Programmers-Perspective-3rd/dp/013409266X/ref=sr_1_1?ie=UTF8&qid=1439347803&sr=8-1&keywords=computer+system+a+programmer) introduces is exact the construct of `gcc` implementation. – Jason Hu Aug 12 '15 at 02:50
  • I like your knowledge but the lack of ability to quote something concrete is such a shame for GCC :( – CinchBlue Aug 12 '15 at 02:53
  • @VermillionAzure i will try to dig then tomorrow if i have a chance. it's not too good to read too much documentation before bed. – Jason Hu Aug 12 '15 at 02:57
  • Can you give a link so maybe we can start digging? – CinchBlue Aug 12 '15 at 02:57
  • @VermillionAzure i don't have secrete in me. all i have are google, `info gcc` and CSAPP by hand. at least you can try to grap that book and have a read. – Jason Hu Aug 12 '15 at 02:59
1

Undefined behavior means that one compiler may compile it differently than another compiler. This is because there is no standard.

In regards to the gcc 4.9.1 compiler, it's evaluating the middle one, then the left one, then the right one. It's hopping like a bunny. Maybe it goes from the middle to the left to the right in the general case.

Don Larynx
  • 685
  • 5
  • 15
  • i bet adding another `i++` will print `3 2 1 4` and another one will print `4 3 2 1 5`. – Jason Hu Aug 12 '15 at 02:32
  • Yes, or in the latter case, even `3 2 1 4 5`. – Don Larynx Aug 12 '15 at 02:33
  • 1
    Another possibility is that mutating arguments are evaluated right to left, and then the non-mutators are evaluated. There are really many ways to get this behavior. – jaggedSpire Aug 12 '15 at 02:33
  • this is the most "comprehensive" UB. i will explain. – Jason Hu Aug 12 '15 at 02:35
  • Eh, if anyone wants an example of the expanded behaviour, I made a longer chain on [Coliru](http://coliru.stacked-crooked.com/a/ecaa62233833e0b2). After all the warnings, it does print the `543216` pattern – jaggedSpire Aug 12 '15 at 02:37
1

GCC doesn't appear to have an extensive and comprehensive database or documentation concerning undefined behavior--if anything, it seems to be generally understood not to venture off into undefined territory. It does, however, have notes on implementation-defined behavior.

To be clear, I don't think that the i = i++ problem can be optimally standardized across platforms and implementations of the same compiler because of differing semantics of calling and computation across architectures: this article may be related or help clarify how an architecture can have a peculiar calling convention.

Also, even the C++ Standard allows for some "compiler room" to allow them to make optimizations in such a way by imposing restrictions on the programmer, though they are often quite small and esoteric, can matter, such as strict aliasing.

Execution order of individual expressions that inherently collide no matter how you fit them into an expression would probably fit under "compiler-space" where the compiler is generally assumed to have full control over this.

Also, due to the age and large volume of the GCC codebase, I'd doubt that this sort of documentation will exist if it doesn't already.

**Edit: User HuStmpHrrr seems to have a handle on how GCC is implemented. Looks like it'll take some digging!

CinchBlue
  • 6,046
  • 1
  • 27
  • 58
0

You can ask GCC to generate assembly code with:

gcc -S -o my_asm_output.s main.c

I got this :

call    ___main
movl    $1, 28(%esp)
movl    28(%esp), %edx
leal    1(%edx), %eax
movl    %eax, 28(%esp)
movl    28(%esp), %eax
leal    1(%eax), %ecx
movl    %ecx, 28(%esp)
movl    28(%esp), %ecx
movl    %ecx, 12(%esp)
movl    %edx, 8(%esp)
movl    %eax, 4(%esp)
movl    $LC0, (%esp)
call    _printf

where LC0:

LC0:
    .ascii "%d %d %d\12\0"
    .text
    .globl  _main
    .def    _main;  .scl    2;  .type   32; .endef

I'm not an expert of assembly but, it look like the print order will be printf("eax edx ecx\n"); with the correspondence increment:

movl    $1, 28(%esp)
movl    28(%esp), %edx
leal    1(%edx), %eax
leal    1(%eax), %ecx

But it is just a fast speculation, maybe I skip something important.

JoseleMG
  • 302
  • 5
  • 18