Non-volatile global variable works in a loop

Question

  1 #include <stdio.h>
  2 #include <stdbool.h>
  3 
  4 bool flag;
  5 
  6 static void foo(int a, int b)
  7 {
  8     printf("why\n");
  9     return;
 10 }
 11 
 12 int main()
 13 {
 14 
 15     while (!flag) {
 16         foo(10, 11);
 17     }
 18 
 19     return 0;
 20 }

Built with aarch64-linux-gnu-gcc -O2 t.c

Disassembled with aarch64-linux-gnu-objdump -Sdf a.out > t

 55 0000000000400460 <main>:
 56   400460:   a9be7bfd    stp x29, x30, [sp, #-32]!
 57   400464:   910003fd    mov x29, sp
 58   400468:   f9000bf3    str x19, [sp, #16]
 59   40046c:   b0000093    adrp    x19, 411000 <__libc_start_main@GLIBC_2.17>
 60   400470:   3940c660    ldrb    w0, [x19, #49]
 61   400474:   35000140    cbnz    w0, 40049c <main+0x3c>
 62   400478:   f9000fb4    str x20, [x29, #24]
 63   40047c:   9100c673    add x19, x19, #0x31
 64   400480:   90000014    adrp    x20, 400000 <_init-0x3e8>
 65   400484:   91198294    add x20, x20, #0x660
 66   400488:   aa1403e0    mov x0, x20
 67   40048c:   97fffff1    bl  400450 <puts@plt>
 68   400490:   39400260    ldrb    w0, [x19]
 69   400494:   34ffffa0    cbz w0, 400488 <main+0x28>
 70   400498:   f9400fb4    ldr x20, [x29, #24]
 71   40049c:   52800000    mov w0, #0x0                    // #0
 72   4004a0:   f9400bf3    ldr x19, [sp, #16]
 73   4004a4:   a8c27bfd    ldp x29, x30, [sp], #32
 74   4004a8:   d65f03c0    ret

My question is why line #68 of the disassembly reloads flag from memory on every iteration. flag is not volatile, so shouldn't it be loaded from memory once and then read from a register? If I remove line #16 of the C code, so that there is no function call in the loop, I can see that flag is loaded from memory only once.

It seems the function call in the loop does the magic.

Is there any explanation for this?

That's the reason we use volatile: to force a read from memory instead of from a register. – Leslie Li Aug 18 '22 at 07:43
We are supposed to always use volatile for such a variable, but perhaps there is a rule under which we don't need to declare it volatile in this case. – Leslie Li Aug 18 '22 at 07:48
Is there any reason such global variables, shared among threads and ISRs, "will never need volatile"? And how about device registers? – Leslie Li Aug 18 '22 at 07:55
But global variables are shared among threads; I'm asking you why they don't have to be volatile-qualified. – Leslie Li Aug 18 '22 at 08:03
@273K You are having that same old confusion that's very tiresome to hear whenever a discussion about volatile pops up. It is true that volatile does not guarantee atomicity, _but that's not why_ you might be forced to use volatile when sharing a variable with threads or ISRs. The reason is rather that some compilers may not realize that an ISR or callback might be called by others than the application itself and therefore generate the wrong code. PC compilers tend to handle such situations much better than embedded compilers. This is a well-known classic bug since some 30 years back. – Lundin Aug 18 '22 at 08:11
More info here: https://electronics.stackexchange.com/a/409570/6102 – Lundin Aug 18 '22 at 08:11

Lundin · Accepted Answer · 2022-08-18T08:20:30.600

1

Because flag has external linkage and the compiler cannot assume that it won't get updated from another translation unit in the middle of execution.

Change flag to static or make it local and then the whole program will be replaced with an eternal loop calling puts over and over.

Edit: relevant disassembly from gcc 12.1 for ARM64 -O3 of the original code:

.L3:
        mov     x0, x20
        bl      puts
        ldrb    w0, [x19]
        cbz     w0, .L3

Changing flag to static creates an eternal loop:

.L2:
        mov     x0, x19
        bl      puts
        b       .L2

Keeping flag with external linkage but commenting out the function call:

.L3:
        b       .L3

The last one happens because, with the function call removed, the loop body no longer contains any side effects such as printing. It is then pointless to check the variable.

edited Aug 18 '22 at 08:20

answered Aug 18 '22 at 07:57

Lundin


I can't explain why, if I remove the foo(10, 11) line, it becomes an infinite loop that never checks flag in memory again – Leslie Li Aug 18 '22 at 07:59
I don't understand x86 assembly; could you try -O2 as I did? Does this differ by architecture? – Leslie Li Aug 18 '22 at 08:07
@LeslieLi I added some ARM64 disassembly. – Lundin Aug 18 '22 at 08:18
In the 3rd case, why doesn't the compiler assume there could be an outside change to flag, instead of getting rid of the check in the loop? – Leslie Li Aug 18 '22 at 08:24
Why does adding a function call mean that "the compiler cannot assume that it won't get updated from another translation unit in the middle of execution"? – Leslie Li Aug 18 '22 at 08:26
@Lundin BTW either there is a bug in the [clang compiler](https://www.godbolt.org/z/xrG7jP5G8), or I need more coffee – Jabberwocky Aug 18 '22 at 08:30
@LeslieLi I'm guessing the part you are looking for is the C17 6.8.5 "An iteration statement whose controlling expression is not a constant expression, that performs no input/output operations-... ...-may be assumed by the implementation to terminate". This loop's controlling expression is not a constant expression, but in case the function is there, it performs I/O and may be assumed to terminate - hence the need to check the variable. But when it no longer performs I/O, the compiler may assume that it will never terminate. – Lundin Aug 18 '22 at 08:34
@Jabberwocky clang is completely broken when it comes to eternal loops. https://stackoverflow.com/questions/59925618/how-do-i-make-an-infinite-empty-loop-that-wont-be-optimized-away Though I think they finally fixed it in some fairly recent update. The code you linked seems compliant though, the compiler may assume the loop to terminate. Make the variable volatile and it will have to be checked. – Lundin Aug 18 '22 at 08:36
Thanks @Lundin, it's still hard to understand clearly. According to C17 6.8.5, any function call could perform I/O operations, and that's why the compiler checks the expression? – Leslie Li Aug 18 '22 at 09:30
@LeslieLi Since the function is in the same translation unit in this case, the compiler should be smart enough to examine the function declaration and look for side effects such as I/O, printf in this case. It boils down to C's core definitions of the "abstract machine" which says: "Accessing a volatile object, modifying an object, modifying a file, or calling a function that does any of those operations are all _side effects_, which are changes in the state of the execution environment." Side effects may not get optimized away. – Lundin Aug 18 '22 at 09:58
@Lundin: I think it's the other way around. If the loop doesn't do any I/O, then the compiler may assume that it **does** terminate, so that if in fact it wasn't supposed to terminate, UB results. But if the loop does do I/O, 6.8.5 is inapplicable and the compiler must simply execute the code as it is, with no assumptions allowed either way. – Nate Eldredge Aug 23 '22 at 02:40
@NateEldredge As per the quote, it depends on the controlling expression - it can't be an integer constant expression or the loop is not allowed to terminate. It is somewhat common for small embedded systems to just hang main() in a `for(;;){}` and every functionality is executed through interrupts/callbacks. Like for example a microcontroller that does nothing but taking serial bus input and translating it to a PWM signal. – Lundin Aug 23 '22 at 06:17

score 1 · Answer 2 · answered Aug 23 '22 at 02:55

Just to say it a little more explicitly than Lundin's answer: the compiler is worried that printf might modify flag.

In general, when a call is made to any code whose source isn't currently visible to the compiler (e.g. because it's defined in another source file), the compiler has to assume it could do anything that well-defined C code can possibly do, and that includes modifying global variables. Standard library functions such as printf are generally not exempt from this assumption.

That said, since standard library functions have their behavior defined by the C standard, the compiler actually could make some assumptions if the compiler authors wanted to implement that. There are some commonly done optimizations based on such assumptions; e.g. math functions have no side effects except possibly setting errno. And in fact there are a few common optimizations of printf, e.g. to replace it by puts when the format string is constant, contains no format specifiers, and ends in \n; this happened in Lundin's example code.

So in principle, an ideal compiler could take advantage of the fact that printf is defined not to modify random global variables, and optimize out the reload of flag. But this would be a very specialized optimization whose benefits probably wouldn't be worth the cost of implementing it. A call to printf is already so expensive, relatively speaking, that the cost of a couple extra load instructions is likely to be lost in the noise.

Understood. So in general, variables like flag here still need to be volatile; it works here only because of function calls like printf. The compiler could turn it into an infinite loop if we change the loop body to code that makes no printf-like calls. – Leslie Li Aug 23 '22 at 05:46
@LeslieLi: "Volatile" in the sense that the compiler assumes they could change across the call to `printf`. Not the same as `volatile`. – Nate Eldredge Aug 23 '22 at 13:48
