8

I am reviewing some code and have come across a busy wait like this:

int loop = us*32;
int x;
for(x = 0;x<loop;x++)
{
    /*do nothing*/      
}

I seem to recall reading that these empty loops can be optimized away. Is this what would happen here or can this work?

Firedragon
  • http://stackoverflow.com/questions/3527829/is-this-a-bug-in-the-intel-c-compiler-icc/3527862#3527862 – zw324 Apr 24 '12 at 14:42
  • `us`? if this code works that must be a really slow CPU. – Karoly Horvath Apr 24 '12 at 14:45
  • @KarolyHorvath It's an embedded system and probably does have some really slow processors in it (by today's standards). It's not something I am totally familiar with. – Firedragon Apr 24 '12 at 14:47

5 Answers

16

The answer is yes, the compiler can optimize out the loop.

Use the volatile qualifier to avoid the optimization:

int loop = us * 32;
volatile int x;
for (x = 0; x < loop; x++)
{
    /*do nothing*/      
}
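
A minimal sketch of how the loop above might be wrapped in a reusable function. The name busy_wait_us and the LOOPS_PER_US constant are assumptions for illustration (32 is simply the factor from the question). The real number of iterations per microsecond depends on the CPU clock and on the code the compiler generates, so it has to be measured on the target.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed calibration: loop iterations per microsecond (32 is taken from the question). */
#define LOOPS_PER_US 32u

/* Spins for roughly `us` microseconds and returns the number of iterations run. */
uint32_t busy_wait_us(uint32_t us)
{
    /* Guard against the multiplication wrapping around. */
    assert(us <= UINT32_MAX / LOOPS_PER_US);

    uint32_t loop = us * LOOPS_PER_US;
    volatile uint32_t x;   /* volatile: every read and write of x must be emitted */

    for (x = 0; x < loop; x++) {
        /* do nothing */
    }
    return x;
}
```

Because x is volatile the loop survives optimization, but the time per iteration can still change with the optimization level, so recalibrate whenever the compiler flags change.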

If you are programming for an embedded target, read your compiler's documentation: toolchains usually provide delay functions that wait for a number of cycles or microseconds passed as a parameter.

For example, avr-gcc has the following function in util/delay.h:

void _delay_us(double __us);
ouah
  • Read a beautiful tutorial on why the volatile keyword should be used: http://www.embedded.com/electronics-blogs/beginner-s-corner/4023801/Introduction-to-the-Volatile-Keyword – Prabhpreet Jun 16 '14 at 13:57
14

You're at the mercy of the compiler. Indeed, if it's smart it will detect that the loop is a no-op. Incidentally, Neil Butterworth has a nice post where he also touches on this subject.

cnicutar
  • Thanks for the answer. It was something in the back of my mind saying this sort of thing could be a problem – Firedragon Apr 24 '12 at 14:46
  • Not really, even when you can't emit inline assembly code the compiler won't (and can't) throw away expressions with local side-effects. – Adriano Repetti Apr 26 '12 at 08:34
5

It's something terribly non-portable.

In some compilers one of these may work (but you have to check with full optimization enabled, because the empty statement may be thrown away):

for (i = 0; i < spinCount; )
   ++i; // yes, HERE

or:

for (i = 0; i < spinCount; ++i)
   ((void)0);    

If you're lucky, your compiler may provide a macro or an intrinsic function that compiles to the nop assembly instruction, something like __nop in MSVC.

As a last resort you can simply add a single assembly instruction (it's compiler dependent; it may be __asm or something like that) to execute... nothing, like this:

for (i = 0; i < spinCount; ++i)
   __asm nop

or (check your compiler documentation):

for (i = 0; i < spinCount; ++i)
   asm("nop");
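
For reference, a sketch of the same idea as a complete function, assuming GCC or Clang (the `__asm__ __volatile__` spelling is specific to those compilers, and the function name spin_nop is made up for illustration). It also assumes the target has an instruction called nop, which is true of most common architectures but worth checking.

```c
/* Assumes GCC or Clang; other compilers spell inline assembly differently. */
unsigned long spin_nop(unsigned long spin_count)
{
    unsigned long i;

    for (i = 0; i < spin_count; ++i) {
        /* volatile asm: the compiler must emit it once per iteration */
        __asm__ __volatile__("nop");
    }
    return i;   /* number of iterations executed */
}
```

The compiler keeps the loop, but it is still free to unroll it or change the loop overhead, so the delay per iteration has to be measured rather than assumed.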

EDIT
If you do not have a nop instruction and you can't add assembly code (I'm sorry, what kind of compiler are you using?), you can rely on the assumption that a statement with a side effect won't be optimized away (or, as posted by @ouah, an access to a variable declared volatile).
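
One way to get such a side effect in plain C is to store to a volatile object inside the loop body. This is only a sketch: spin_sink and spin_with_side_effect are hypothetical names, and the counter itself is deliberately left non-volatile to show that the store alone keeps the loop alive.

```c
#include <stdint.h>

/* Hypothetical sink: a volatile object the loop writes to on every pass. */
volatile uint32_t spin_sink;

uint32_t spin_with_side_effect(uint32_t spin_count)
{
    uint32_t i;

    for (i = 0; i < spin_count; ++i) {
        spin_sink = i;   /* volatile store: a side effect the compiler must keep */
    }
    return i;            /* number of iterations executed */
}
```

Each volatile store has to be performed, so the loop cannot be removed, though the time it takes still depends on the compiler and the target.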

Adriano Repetti
5

Nothing in the language standard forbids it, so compilers can do it if they are able.

Let's disassemble the output of GCC 4.8 to see what it does.

Input code:

int main() {
    int i;
    for(i = 0; i < 16; i++)
        ;
}

Compile and disassemble:

gcc -c -g -std=c99 -O0 a.c
objdump -S a.o

Output:

a.o:     file format elf64-x86-64


Disassembly of section .text:

0000000000000000 <main>:
int main() {
   0:   55                      push   %rbp
   1:   48 89 e5                mov    %rsp,%rbp
    int i;
    for(i = 0; i < 16; i++)
   4:   c7 45 fc 00 00 00 00    movl   $0x0,-0x4(%rbp)
   b:   eb 04                   jmp    11 <main+0x11>
   d:   83 45 fc 01             addl   $0x1,-0x4(%rbp)
  11:   83 7d fc 0f             cmpl   $0xf,-0x4(%rbp)
  15:   7e f6                   jle    d <main+0xd>
  17:   b8 00 00 00 00          mov    $0x0,%eax
        ;
}
  1c:   5d                      pop    %rbp
  1d:   c3                      retq   

The loop is there: that jle jumps back.

With -O3:

0000000000000000 <main>:
   0:   31 c0                   xor    %eax,%eax
   2:   c3                      retq

which just returns 0. So it was completely optimized away.

The same analysis can be done for any compiler.
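
To repeat the experiment against the volatile fix suggested in another answer, something like the following could be built and disassembled with the same commands. The function names are made up for illustration. At -O3 I would expect count_plain to collapse to returning a constant and count_volatile to keep its compare and branch, but check the disassembly from your own compiler rather than taking that on trust.

```c
/* Plain counter: the optimizer is free to delete this loop. */
int count_plain(void)
{
    int i;
    for (i = 0; i < 16; i++)
        ;
    return i;
}

/* volatile counter: every access to i must be emitted, so the loop stays. */
int count_volatile(void)
{
    volatile int i;
    for (i = 0; i < 16; i++)
        ;
    return i;
}
```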

Ciro Santilli OurBigBook.com
2

Some compilers, like gcc, have historically detected an empty for loop and deliberately left it in place, on the expectation that you put it there as a delay loop. Newer gcc versions no longer promise this and can remove such loops when optimizing. You can read more about that at http://gcc.gnu.org/onlinedocs/gcc-4.7.0/gcc/Non_002dbugs.html

Mind you, this is compiler specific, so don't count on it with all compilers.

chmeee
  • Indeed, I've run into this with mingw/msvc, where I had to debug code doing a busy wait on a variable modified by another thread, and the loop was 'optimized' (in the mingw case only with -O2) into a jmp statement jumping to itself (i.e. an infinite loop). Much fun was had debugging that. :) – aphax Oct 30 '12 at 22:23
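
For the case in the comment above (spinning on a variable that another thread modifies), a sketch using C11 atomics, assuming a compiler that ships <stdatomic.h>. The names ready_flag, signal_ready and wait_until_ready are hypothetical. Because the flag is atomic, the compiler may not hoist the load out of the loop, which is what turns such a wait into a jump to itself.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical flag set by another thread; zero-initialized, so it starts false. */
atomic_bool ready_flag;

/* Called by the thread that wants to release the waiter. */
void signal_ready(void)
{
    atomic_store(&ready_flag, true);
}

/* Spins until the flag is set and returns how many times it had to poll. */
unsigned long wait_until_ready(void)
{
    unsigned long polls = 0;

    while (!atomic_load(&ready_flag)) {
        ++polls;   /* the flag is re-read from memory on every pass */
    }
    return polls;
}
```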