Overhead to declaring a variable within a function run in a loop?

Question

This question has been asked before. The answer there was "Stack space for local variables is usually allocated in function scope.", so declaring a variable inside or outside a loop makes no difference in overhead.

Now, imagine that we have a snippet with a function inside of the loop:

void do_sth(int &i) {int var=i+1;}
int i = 0;
while(i < 100)
{
    do_sth(i);
    i++;
}

And a second snippet with a variable declared outside:

int i = 0;
int var;
while(i < 100)
{
    var = i+1;
    i++;
}

My question is - what is the overhead in case of the first snippet in a practical scenario (with a modern compiler)? If indeed there is an overhead, then how big is it? Is it comparable with, say, doing extra addition (operator+) on integers in each step of the loop?

The question is so unclear. What does the code do? The first one is more absurd. — Nawaz, Dec 09 '11 at 16:48
Profile it! You have both the code versions so profile it and find out for yourself instead of speculative answers that will start flowing in now. — Alok Save, Dec 09 '11 at 16:49
@Nawaz Well, the do_sth() function is defined in the question, but in general it will do more operations (additions, multiplications, etc.) and push the results to some outside containers. — pms, Dec 09 '11 at 16:50
The two snippets of code do different things, so you cannot compare them. — Kerrek SB, Dec 09 '11 at 16:52
That entirely depends on your compiler/compiler configuration. There is a good chance, that both versions result in the same assembly code due to function inlining. (If only your both versions would do the same). — Constantinius, Dec 09 '11 at 16:54
@KerrekSB they do the same thing (imagine that there are more additions/multiplications in the do_sth function that we want to run in the loop, and then the result in each step of the loop is pushed to a vector). — pms, Dec 09 '11 at 16:59
@pms: In the first snippet, `do_sth` is a no-op, so the two snippets certainly don't do the same thing with regard to visible effects. If you have something else in mind, perhaps you can elaborate a little bit on the context? — Kerrek SB, Dec 09 '11 at 17:02

score 2 · Answer 1 · answered Dec 09 '11 at 16:52

A modern compiler, with optimizations enabled, will most likely inline the function call, provided it complies with the inline requirements (no external linkage, etc.), and so the two versions will generate identical code.

If the function is not inlined, then there is indeed an overhead: a function call and a function return, with the argument passed on the stack or in a register, depending on the calling convention. That is a bit more than a simple addition.

score 1 · Accepted Answer · answered Dec 09 '11 at 16:57

The best way to find out is to look at the disassembled code in a debugger. In this case the first snippet has the overhead of a function call in the loop. What exactly happens for a function call depends on the calling convention used. The usual scenario in assembly is: the caller pushes the argument onto the stack and calls the function; the function creates a stack frame and reads the parameter from the stack; on return, the stack is popped to restore the caller's stack frame. For this code, whose function body is very short, the overhead can be around 10 times the actual function body (10 instructions vs. 1 instruction). If you define the function as an inline function, all of that overhead goes away.
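To look at the call overhead in isolation, here is a minimal sketch with two versions of the same one-line helper, one of which is prevented from being inlined. The names `add_one`, `add_one_noinline`, `sum_inlinable` and `sum_call` are hypothetical, and `__attribute__((noinline))` is a GCC/Clang extension rather than standard C++, so this assumes one of those compilers. The results are summed so that the work has a visible effect and is not removed as dead code.

```cpp
#include <cassert>

// Hypothetical helper: small enough that an optimizing compiler will
// usually inline it when the definition is visible at the call site.
static int add_one(int i) { return i + 1; }

// Same body, but inlining is suppressed. __attribute__((noinline)) is a
// GCC/Clang extension (assumption: one of those compilers is in use).
__attribute__((noinline)) static int add_one_noinline(int i) { return i + 1; }

// Loop whose helper may be inlined by the optimizer.
long sum_inlinable(int n) {
    assert(n >= 0);
    long total = 0;
    for (int i = 0; i < n; ++i) total += add_one(i);
    return total;
}

// Loop whose helper is not allowed to be inlined.
long sum_call(int n) {
    assert(n >= 0);
    long total = 0;
    for (int i = 0; i < n; ++i) total += add_one_noinline(i);
    return total;
}
```

Both functions return the same value; only the generated code may differ. Compiling with something like `g++ -O2 -S` and comparing the two loops in the assembly output is one way to see whether a call instruction remains in each.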

score 0 · Answer 3 · answered Dec 09 '11 at 16:51

Any reasonable compiler, if asked to optimize the code at even the most basic level, would generate exactly the same instructions for the two snippets.

answered Dec 09 '11 at 16:51

Ernest Friedman-Hill

score 0 · Answer 4 · answered Dec 09 '11 at 16:53

Current compilers are smart enough to check the usage of variables and functions, and optimize the code at both compile time and link time. Hence there should be negligible difference between the two pieces of code after optimization. The LLVM link-time optimization document should provide better insight into this.

score 0 · Answer 5 · answered Dec 09 '11 at 16:53

int i = 0;
int var;
while(i < 100)
{
    var = i+1;
    i++;
}

int i = 0;
while(i < 100)
{
    int var = i+1;
    i++;
}

Will generally produce exactly the same code. There are, however, good reasons to use the second.
Its intent is clearer, and the code may be optimised better since the compiler knows that var is only required for the duration of the loop.

answered Dec 09 '11 at 16:53

Martin Beckett

Ok, but my questions asks specifically for the scenario with the function call. I have few loops like that and so I want to avoid rewriting the code and put the common part into a function. But the question is also out of curiosity :) – pms Dec 09 '11 at 17:06
@pms - A function call will cost a couple of cycles, unless the compiler decides to inline it. But the best advice is to code for readability, then measure, then optimise if necessary – Martin Beckett Dec 09 '11 at 17:09

Grizzly · Answer 6 · 2011-12-09T17:03:16.753

I'm assuming that the compiler is allowed to optimise the code to the best of its ability (otherwise talking about performance is somewhat pointless).

If the body of do_sth is visible when compiling the loop in your example, the compiler will most likely inline it and then remove the assignment to var (and the allocation of stack space for var) as dead code, so in this scenario there is effectively no overhead. If do_sth can't be inlined, the cost of the function call is more of a concern than the declaration of an int. And if the function can be inlined, there is a distinct possibility of the compiler transforming the first version into the second one even if var isn't dead code. So for examples like this it really won't matter.

It can matter if your variable is of a more complex type (a non-POD class type). In that case the first version will call the constructor and destructor once for each iteration, while the second will call the assignment operator once for each iteration and the constructor and destructor only once. Note, however, that this doesn't say anything about which version is faster (that depends on the implementation of those methods).
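The difference in constructor, destructor and assignment calls can be made visible with a small counting type. This is only a sketch: `Tracked`, `reset_counts`, `declared_inside` and `declared_outside` are hypothetical names, and the object declared inside the loop body stands in for a local variable of a function called on every iteration.

```cpp
// Hypothetical type that counts calls to its special member functions.
struct Tracked {
    static int ctors, dtors, assigns;
    int value;
    explicit Tracked(int v = 0) : value(v) { ++ctors; }
    ~Tracked() { ++dtors; }
    Tracked& operator=(int v) { value = v; ++assigns; return *this; }
};
int Tracked::ctors = 0;
int Tracked::dtors = 0;
int Tracked::assigns = 0;

// Reset all counters before a measurement.
void reset_counts() { Tracked::ctors = Tracked::dtors = Tracked::assigns = 0; }

// Object declared inside the loop body: constructed and destroyed on
// every iteration, never assigned.
long declared_inside(int n) {
    long total = 0;
    for (int i = 0; i < n; ++i) {
        Tracked var(i + 1);
        total += var.value;
    }
    return total;
}

// Object declared once outside the loop: one construction, one
// destruction (at function exit), and one assignment per iteration.
long declared_outside(int n) {
    long total = 0;
    Tracked var;
    for (int i = 0; i < n; ++i) {
        var = i + 1;
        total += var.value;
    }
    return total;
}
```

For n = 100, `declared_inside` records 100 constructions and 100 destructions, while `declared_outside` records 1 construction, 1 destruction and 100 assignments. Which of the two is cheaper still depends on what those member functions actually do.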

score 0 · Answer 7 · answered Dec 09 '11 at 17:04

Are you sure you've framed the problem/question correctly? From what you have posted, the do_sth function will have the usual function call overhead. If you inline it, that overhead will go away.

answered Dec 09 '11 at 17:04

shekhar

score 0 · Answer 8 · answered Dec 09 '11 at 19:52

What I would want to know is - what percent of the overall program's execution time is spent within this loop?

For example, if this is only a test program, and you're executing that code 1e9 times and timing it, and doing nothing else, then it will make a significant difference if the function call is inlined or not.
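A test program of that kind could be sketched roughly as follows, using `std::chrono::steady_clock` from the standard library. The names `run_loop` and `time_loop` are hypothetical, and the loop body is only a stand-in for the real work. Note that an optimizer may simplify such a trivial loop drastically, so the measured time says little unless the body resembles the real code.

```cpp
#include <chrono>
#include <cstdint>

// Hypothetical workload standing in for the loop in the question.
std::uint64_t run_loop(std::uint64_t n) {
    std::uint64_t total = 0;
    for (std::uint64_t i = 0; i < n; ++i) total += i + 1;
    return total;
}

// Time one call of run_loop and return the elapsed time in seconds.
// The result is written to `out` so that the work has an observable
// effect and is not discarded entirely.
double time_loop(std::uint64_t n, std::uint64_t& out) {
    auto start = std::chrono::steady_clock::now();
    out = run_loop(n);
    auto stop = std::chrono::steady_clock::now();
    return std::chrono::duration<double>(stop - start).count();
}
```

Calling `time_loop` once for each variant of the loop and comparing the returned seconds gives a rough number, which then has to be set against the run time of the whole program.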

In any realistic program, in my experience, it's very rare for such a loop to account for much of the run time. I assume you don't need to be told this, but some programmers have to learn a sense of proportion. Getting a haircut won't help one lose weight.

answered Dec 09 '11 at 19:52

Mike Dunlavey

I totally agree with your statement about learning a sense of proportion. This is why I asked this question, and the kind of answer by @Kamyar Souri is exactly what I wanted to know. In the case of my program the loops are a main component of the program (simulations). I have never studied assembler, so I am asking to get a sense of what exactly happens in such a scenario. – pms Dec 10 '11 at 00:20
@pms: Glad you got what you needed. I would encourage you to learn some assembler. It takes the mystery out of what the machine's doing. – Mike Dunlavey Dec 10 '11 at 00:34
