3

I have written a C program in which main calls another function many times (~1 million). In one build I declared this function as __inline, and in the other as __declspec(noinline). I checked the disassembly with WinDbg: the noinline build uses push, pop and branch instructions like a normal function call, while the inline build has no such instructions and inlines the function properly. Yet the run times of the two builds are exactly the same. The code follows (I am running it on a Cortex-A9 CPU, Tegra 3):

#include <stdio.h>
#include <windows.h>   /* GetTickCount() */

__inline int multifunc(int a, int b);

int main(int argc, char **argv)
{
    unsigned long int timeBefore, i;
    unsigned long int timeAfter;
    unsigned int a[11500], j, k, l;

    timeBefore = GetTickCount();
    printf("\n%lu", timeBefore);

    for (l = 1; l < 300; l++)
    {
        for (i = 0; i < 11500; i++)
        {
            j = i + l;
            k = 1;
            a[i] = multifunc(j, k);
        }
    }

    timeAfter = GetTickCount();
    printf("\n%lu", timeAfter);
    /* elapsed time in milliseconds */
    printf("\n%lu", timeAfter - timeBefore);

    return 0;
}

__inline int multifunc(int a, int b)
{
    int d;
    d = a + b;
    printf("%d", d);
    return d;
}

Can anyone explain why? All I change for the second test is __inline to __declspec(noinline).

BenMorel
  • IMHO it is likely that this does not matter at all since printf is a quite expensive function anyway; if printf takes 10000 cycles, an overhead of 6 cycles to initialize a stack frame should be barely noticeable. – fuz Mar 14 '13 at 06:19

3 Answers

3

The printf() call is incredibly expensive. Function call times are dwarfed by the time required to execute printf().

Empirical test

How much slower is printf() than a function call? You won't get the same results. I'm using Linux, X11, and xterm.

10^9 function calls

__attribute__((noinline))
static int function(int x)
{
    return x;
}
int main(int argc, char *argv[])
{
    int i, a = 0;
    for (i = 0; i < 1000000000; i++)
        a += function(i);
    return a;
}

10^5 printf()

#include <stdio.h>
int main(int argc, char *argv[])
{
    int i;
    for (i = 0; i < 100000; i++)
        printf("%d\n", i);
    return 0;
}

Results

Wall clock time on my system shows that the printf() program takes 7.6 times as long as the one with function calls, even though it makes 10,000 times fewer calls. That means a single printf() takes about 76,000 times as long as a function call. Leave the inlining decisions to your compiler.
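The per-call figure is just the wall-clock ratio scaled by the ratio of call counts. A minimal sketch of that arithmetic follows; the helper name per_call_ratio is hypothetical and not part of either test program.

```c
/* Ratio of time per call between two programs.
   wall_ratio: (wall time of program B) / (wall time of program A)
   calls_a, calls_b: number of calls each program makes.
   The name per_call_ratio is made up for this illustration. */
double per_call_ratio(double wall_ratio, double calls_a, double calls_b)
{
    /* B makes calls_b calls in wall_ratio times A's total time,
       so each of B's calls costs wall_ratio * calls_a / calls_b
       times as much as one of A's calls. */
    return wall_ratio * (calls_a / calls_b);
}
```

With the numbers above, per_call_ratio(7.6, 1e9, 1e5) comes out at about 76,000.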

Dietrich Epp
  • It's worth verifying that the compiler doesn't apply any clever optimizations, like transforming it into one giant call to puts. –  Mar 14 '13 at 06:54
  • @H2CO3: That would require loop unrolling a 100k element loop—thankfully, `nm -D` shows that `printf` is indeed used. – Dietrich Epp Mar 14 '13 at 06:57
  • Nice. And understandable, that would be enormous, really. But I've seen compilers doing pretty neat things, like in the case of the answer of mine to [this question](http://stackoverflow.com/questions/15114140/writing-binary-number-system-in-c-code), `clang` basically constant-folded an entire function body. –  Mar 14 '13 at 06:59
2

Your function includes a printf. Printing to the console is several orders of magnitude slower than calling a single function, so whether it's inlined or not, most of the time is spent in the printf.
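To see the inlining effect at all, the printf has to come out of the timed function. Below is a rough sketch of the question's loop with that change. The names add_only and time_add_loop_ms are made up for this example, and it uses the standard clock() (CPU time) rather than GetTickCount(), so that it builds anywhere. An optimizing compiler may still inline or simplify the loop on its own, so check the disassembly before trusting the numbers.

```c
#include <time.h>

/* Same arithmetic as the question's multifunc(), minus the printf().
   Mark this with your compiler's inline / noinline keyword to compare builds. */
static int add_only(int a, int b)
{
    return a + b;
}

/* Runs the question's loop shape and returns the elapsed CPU time in ms.
   The array checksum is written to *sum_out so the results are actually
   used and the compiler can't simply discard the stores. */
double time_add_loop_ms(unsigned long *sum_out)
{
    static unsigned int a[11500];
    unsigned long sum = 0;
    unsigned int i, l;
    clock_t start, end;

    start = clock();
    for (l = 1; l < 300; l++)
    {
        for (i = 0; i < 11500; i++)
        {
            a[i] = (unsigned int)add_only((int)(i + l), 1);
        }
    }
    end = clock();

    /* Checksum is computed outside the timed region. */
    for (i = 0; i < 11500; i++)
    {
        sum += a[i];
    }
    *sum_out = sum;

    return (double)(end - start) * 1000.0 / CLOCKS_PER_SEC;
}
```

Call time_add_loop_ms() several times per build and compare the spread of results, not just one run against another.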

nvoigt
  • But the point is that even with printf, the two functions should show some difference. Since the number of calls is quite large, the timings of the two functions should differ hugely. Those 10000 cycles are common to both programs, but the 6 cycles are extra in one of them, which should show a difference of several seconds. – sidg_hrdwre Mar 14 '13 at 06:49
  • 1
    Do both runs show the *exact same* tick count elapsed? They will likely vary, by a few ticks at least, even if you run the same program twice. The benefit of inlining vanishes in the background noise of whatever makes your program run at different speeds even when you run it twice in a row: caching, background tasks, room temperature and random chance. – nvoigt Mar 14 '13 at 07:51
  • @nvoigt is correct. The signal-to-noise ratio will be very poor on a modern OS with non-trivial I/O subsystems. – Martin James Mar 14 '13 at 09:48
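
The "several seconds" estimate in the comments above can be checked with back-of-the-envelope arithmetic. The sketch below uses the 6 cycles per call quoted in the comments and assumes a 1.3 GHz clock for the Tegra 3. Both figures are assumptions, not measurements, and the function names are made up for this example.

```c
#include <stdio.h>

/* Total calls made by the question's nested loops:
   outer_iters outer iterations times inner_iters inner iterations. */
unsigned long total_calls(unsigned long outer_iters, unsigned long inner_iters)
{
    return outer_iters * inner_iters;
}

/* Extra time, in milliseconds, spent on call overhead alone. */
double call_overhead_ms(unsigned long calls, double cycles_per_call, double clock_hz)
{
    return (double)calls * cycles_per_call / clock_hz * 1000.0;
}

/* Prints the estimate for the question's loop. */
void print_overhead_estimate(void)
{
    /* l runs 1..299 and i runs 0..11499 in the question's code */
    unsigned long calls = total_calls(299UL, 11500UL);
    /* 6 cycles per call and 1.3 GHz are assumed values */
    double ms = call_overhead_ms(calls, 6.0, 1.3e9);

    printf("%lu calls, about %.1f ms of call overhead\n", calls, ms);
}
```

Under those assumptions, the roughly 3.4 million calls add up to about 16 ms of call overhead, not seconds. That is about one tick of GetTickCount(), whose resolution is typically 10 to 16 ms, so the difference is lost next to the time spent in printf.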
1

Just guessing: you are inlining a function with a printf in it. Since printf executes far more instructions than a couple of pushes, the benefit of inlining is almost zero. Note that you reported this on Windows; I guess the same will happen on Linux too.

Felice Pollano