inline functions

Question

Possible Duplicate:
Benefits of inline functions in C++?

I am confused about inline functions.

People say inline functions save CPU time by replacing the function call with the function's code, but that this increases the size of the code compared with a normal function.

So the real question is: if I call that inline function in a loop 10 times, will the code size increase?

Suppose the inline function is 2 bytes in size; will the code grow by 20 bytes?

Can anyone explain it to me?

Decide which language you're talking about. I suggest removing Java for a start. — Joe, Jun 24 '11 at 12:12
sorry i didnt know that i just know java developed from the concepts of c++ so i thought java might have inline concepts — jack, Jun 24 '11 at 12:17
@Qwerky Java's just-in-time compiler automatically inlines methods. But there is indeed no way in Java to explicitly say that a method should be inlined. — Jesper, Jun 24 '11 at 12:22
@Jesper, in fairness there is also no way to force the C++ compiler to inline a function, the `inline` keyword is but a mere suggestion. It may also inline functions you have not marked as such. — edA-qa mort-ora-y, Jun 24 '11 at 12:38
@jack, calling a function also has overhead. For tiny functions doing inline can actually reduce the code size. — edA-qa mort-ora-y, Jun 24 '11 at 12:39
`inline` is the compiler hint to use copy & paste when appropriate. — AJG85, Jun 24 '11 at 14:15
inline is universally ignored by the compiler as a hint to inline. The compiler writers learned a long time ago the programmers are really bad (and I mean really really bad) at deciding what needs to be inlined. The compiler makes the decision for you. So this is not something you should ever worry about. So don't worry the compiler is smarter than you. And let me be the first to welcome our new compiler overlords. — Martin York, Jun 24 '11 at 14:25
@Martin Yup, that was just pre-coffee humor. Nowadays `inline` is only used to avoid redefinition of function bodies by includes. I should have put hint in quotes as it's true the compiler will do whatever it feels like and that is probably better than what you could ask for anyway. — AJG85, Jun 24 '11 at 15:14
@Martin, I don't think I'd go that far. GCC does seem to honour the inline statement in a lot of cases. — edA-qa mort-ora-y, Jun 24 '11 at 16:12
@edA: Its not honoring them. It is just deciding that it is better to inline them than not (unless you force it to inline). Try removing the inline statement it will still inline them (though inline is required for the linker in some situations). — Martin York, Jun 24 '11 at 16:43

score 9 · Accepted Answer · answered Jun 24 '11 at 12:14

The same code will be executed 10 times, but it still sits inside a single loop, so the code is not copied 10 times in a row. The size will not grow with the number of executions.
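A minimal sketch of the distinction, assuming the compiler does choose to inline (it is free not to). The function `twice` and both callers are hypothetical names made up for illustration. What matters for code size is the number of call sites in the source, not how often one call site runs.

```cpp
// Hypothetical tiny function, used only for illustration.
inline int twice(int x) { return 2 * x; }

// One call site, executed 10 times: if `twice` is inlined, its body
// appears once, inside the loop body.
int sum_in_loop() {
    int total = 0;
    for (int i = 0; i < 10; ++i) {
        total += twice(i);
    }
    return total;
}

// Three separate call sites, each executed once: if `twice` is inlined,
// its body may be copied three times, once per call site.
int sum_three_sites(int a, int b, int c) {
    return twice(a) + twice(b) + twice(c);
}
```

Whether either function actually grows depends on the compiler and its settings; for a body this small, the inlined form may well be shorter than a call.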

Leif


Right, the "code bloat" argument (which is not a good one) applies when there are 10 or 20 different places in your code that call the function. That said, in C++ it's really up to the compiler whether to inline or not, and though you are given ways to ask for inlining they are not binding. So I recommend not worrying about this and trusting your compiler. – Kate Gregory Jun 24 '11 at 12:17
Well, 10 is a tricky number, assume 1000 and you're correct. For small loop sizes the compiler may also unroll the loop, thus actually creating 10 copies of the inline function. It depends on many factors. – edA-qa mort-ora-y Jun 24 '11 at 12:36
That's right. I just was thinking about adding this. – Leif Jun 24 '11 at 12:37
@edA-qa mort-ora-y: It may also unroll big loops, e.g. to get the loop in shape for SIMD instructions, it might unroll into groups of 4. – Sebastian Mach Jun 24 '11 at 12:49
@phresnel, @edA-qa mort-ora-y: indeed, the compiler might use a similar trick to Duff's Device to get the loop in shape, also unrolling a large loop (by blocks) means removing a number of check/jumps. – Matthieu M. Jun 24 '11 at 14:29
@edA-qa mort-ora-y @Matthieu M.: though it does not even have to use duff-loops if it splits up the loop into a loop and an epilogue for the remaining elements – Sebastian Mach Jun 24 '11 at 14:37

score 3 · Answer 2 · edited Jun 24 '11 at 12:33

I don't know why you'd think the number of loop iterations would matter. Let's see. Suppose you write this:

inline int foo() { return 5 * gargle(); }

/* later... */

for (size_t i = 0; i < 100; ++i)
{
  const int x = i * foo();
  baz(x + lookup[i]);
}

If foo gets inlined, then essentially the compiler treats your code as though you had written:

for (size_t i = 0; i < 100; ++i)
{
  baz(i * (5 * gargle()) + lookup[i]);
}

So the code only gets replaced at the call site, once.

(It's a separate matter entirely whether loop unrolling is happening.)
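For anyone who wants to try this, here is a self-contained sketch of the two loops above. The definitions of `gargle`, `baz` and `lookup` are hypothetical stand-ins, since the snippets leave them undefined, and `baz_total` exists only so the effect of the loop can be observed.

```cpp
#include <cstddef>

// Hypothetical stand-ins for the names the snippets leave undefined.
int gargle() { return 3; }
int lookup[100] = {};                 // all zeros
long baz_total = 0;                   // records what baz() has received
void baz(int x) { baz_total += x; }

inline int foo() { return 5 * gargle(); }

// The loop as written: one call site for foo(), executed 100 times.
void with_call() {
    for (std::size_t i = 0; i < 100; ++i) {
        const int x = static_cast<int>(i) * foo();
        baz(x + lookup[i]);
    }
}

// The loop as the compiler may treat it after inlining foo():
// the body of foo() appears once, at the single call site.
void hand_inlined() {
    for (std::size_t i = 0; i < 100; ++i) {
        baz(static_cast<int>(i) * (5 * gargle()) + lookup[i]);
    }
}
```

Both functions add the same amount to `baz_total`, which is the point: inlining changes where the code lives, not what it computes.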

score 2 · Answer 3 · answered Jun 24 '11 at 12:33

It totally depends on you, your code, and your compiler. Imagine you have:

#include <vector>

int frob (int a, int b) {
    return a + b;
}

int main () {
    std::vector<int> results(20), lhs(20), rhs(20);
    for (int i=0; i<20; ++i) {
        results[i] = frob(lhs[i], rhs[i]);
    }
}

Now if your compiler optimizes for size, it might leave this as is. But if it optimizes for performance, it may (or may not, some compilers use rough heuristic measures to determine that) transform this to:

int main () {
    std::vector<int> results(20), lhs(20), rhs(20);
    for (int i=0; i<20; ++i) {
        results[i] = lhs[i] + rhs[i];
    }
}

If it optimizes even more, it might unroll the loop:

int main () {
    std::vector<int> results(20), lhs(20), rhs(20);
    for (int i=0; i<20; i+=4) {
        results[i] = lhs[i] + rhs[i];
        results[i+1] = lhs[i+1] + rhs[i+1];
        results[i+2] = lhs[i+2] + rhs[i+2];
        results[i+3] = lhs[i+3] + rhs[i+3];
    }
}

Size increased. But if the compiler now decides to also do a bit of auto-vectorization, it might transform it again into something not unlike this:

int main () {
    std::vector<int> results(20), lhs(20), rhs(20);
    for (int i=0; i<20; i+=4) {
        vec4_add (&results[i], &lhs[i], &rhs[i]);            
    }
}

Size decreased.

Next, the compiler, smart as always, unrolls again and kills the loop entirely:

int main () {
    std::vector<int> results(20), lhs(20), rhs(20);

    vec4_add (&results[0], &lhs[0], &rhs[0]);
    vec4_add (&results[4], &lhs[4], &rhs[4]);
    vec4_add (&results[8], &lhs[8], &rhs[8]);
    vec4_add (&results[12], &lhs[12], &rhs[12]);
    vec4_add (&results[16], &lhs[16], &rhs[16]);
}

Another optimization g++ will perform, if it can prove enough about the code, is to replace the vector with an ordinary array:

int main () {
    int results[20] = {0}, lhs[20] = {0}, rhs[20] = {0};

    vec4_add (&results[0], &lhs[0], &rhs[0]);
    vec4_add (&results[4], &lhs[4], &rhs[4]);
    vec4_add (&results[8], &lhs[8], &rhs[8]);
    vec4_add (&results[12], &lhs[12], &rhs[12]);
    vec4_add (&results[16], &lhs[16], &rhs[16]);
}

It sees that everything is constant, and folds:

int main () {
    int results[20] = {0}; // because every lhs[i]+rhs[i] == 0
}

It concludes that results is actually unused, and finally spits out:

int main() {
}
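The `vec4_add` in the steps above stands for a single SIMD instruction and is not a real function. As a rough sketch, assuming a plain scalar stand-in for it (my own hypothetical definition, along with the helper names `add_scalar` and `add_blocked`), the blocked loop can be checked against the element-by-element loop like this:

```cpp
#include <cstddef>
#include <vector>

// Hypothetical scalar stand-in for `vec4_add`: adds four adjacent ints.
// A vectorizing compiler would emit one SIMD add instead of this loop.
void vec4_add(int* out, const int* a, const int* b) {
    for (int k = 0; k < 4; ++k) {
        out[k] = a[k] + b[k];
    }
}

// The straightforward loop, one element per iteration.
std::vector<int> add_scalar(const std::vector<int>& lhs,
                            const std::vector<int>& rhs) {
    std::vector<int> results(lhs.size());
    for (std::size_t i = 0; i < lhs.size(); ++i) {
        results[i] = lhs[i] + rhs[i];
    }
    return results;
}

// The blocked form, four elements per iteration. Assumes both inputs
// have the same size and that it is a multiple of 4 (20 in the answer).
std::vector<int> add_blocked(const std::vector<int>& lhs,
                             const std::vector<int>& rhs) {
    std::vector<int> results(lhs.size());
    for (std::size_t i = 0; i < lhs.size(); i += 4) {
        vec4_add(&results[i], &lhs[i], &rhs[i]);
    }
    return results;
}
```

For sizes that are not a multiple of 4, a real compiler adds a short epilogue loop for the leftover elements, as mentioned in the comments on the accepted answer.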

Any answer that is illustrative AND makes me laugh definitely deserves the +1. — buildsucceeded, Nov 13 '12 at 23:36

score 0 · Answer 4 · answered Jun 24 '11 at 12:21

When you use inline, you are asking the compiler to replace calls to your inline function with the code from that function (it is a request, not a command). For example:

inline int min(int a, int b)
{
    return (a < b) ? a : b;
}

void some_method()
{
    int x = min(20, 30);
}

could be changed by the compiler to:

void some_method()
{
    int x = (20 < 30) ? 20 : 30;
}

If this were in a loop, there would still be just the one replacement, so it wouldn't increase code size in that particular situation.

That said, there are Problems With Inline Functions that should be considered. Frequently, letting the compiler decide what to inline will be more efficient than doing it yourself.

score 0 · Answer 5 · answered Jun 24 '11 at 12:45

Using the inline keyword gives the compiler permission to inline function calls, an opportunity it may or may not take up.

The reason that this might make the program quicker is that the CPU won't have to make a function call, and won't have to push parameters onto the stack, so in fact the compiler might be able to generate much less code at the call site than when it performs a function call.

Additionally the optimiser might be able to re-order/eliminate instructions that are now closer together to give even better performance and even less code.

The only way to know if this happens is trial and error. You write it one way and measure performance and code size, and then write it the other way and test again.
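One possible way to set up such a comparison is sketched below. It assumes GCC or Clang, whose `noinline` and `always_inline` attributes make the two variants differ reliably; on other compilers the macros fall back to nothing special and both variants may compile to the same thing. All names (`DEMO_NOINLINE`, `mix_call`, `time_loop`, and so on) are made up for this example.

```cpp
#include <chrono>
#include <cstdint>

// Assumption: GCC/Clang attribute syntax. Elsewhere, no forcing is done.
#if defined(__GNUC__) || defined(__clang__)
#define DEMO_NOINLINE __attribute__((noinline))
#define DEMO_FORCEINLINE inline __attribute__((always_inline))
#else
#define DEMO_NOINLINE
#define DEMO_FORCEINLINE inline
#endif

// Same body twice: one version is never inlined, the other always is.
DEMO_NOINLINE std::uint64_t mix_call(std::uint64_t x) { return x * 31u + 7u; }
DEMO_FORCEINLINE std::uint64_t mix_inline(std::uint64_t x) { return x * 31u + 7u; }

struct Timing {
    std::uint64_t result;  // returned so the loop cannot be optimized away
    double seconds;        // wall-clock time for the whole loop
};

// Runs f repeatedly, feeding each result into the next call.
template <typename F>
Timing time_loop(F f, int iterations) {
    const auto start = std::chrono::steady_clock::now();
    std::uint64_t acc = 1;
    for (int i = 0; i < iterations; ++i) {
        acc = f(acc);
    }
    const auto stop = std::chrono::steady_clock::now();
    return {acc, std::chrono::duration<double>(stop - start).count()};
}

Timing time_call(int n) {
    return time_loop([](std::uint64_t x) { return mix_call(x); }, n);
}

Timing time_inline(int n) {
    return time_loop([](std::uint64_t x) { return mix_inline(x); }, n);
}
```

Call both with a large iteration count, compare the `seconds` fields over several runs, and check that the `result` fields match. Code size has to be compared separately, for example by looking at the size of the object files produced for each variant.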

quamrana


The compiler doesn't need any permission to inline, and many compilers will inline functions which aren't declared inline. – James Kanze Jun 24 '11 at 14:07
@James: Does that make the `inline` keyword more like a way to stop the linker finding multiple definitions? – quamrana Jun 24 '11 at 14:20
I wouldn't put it in those terms, but sort of. It does mean that the linker can't treat the multiple definitions as an error (if they get that far). Of course, that's the formal meaning; the *intent* is that the compiler make a greater effort to inline the function than it otherwise would. And the motivation is to make the definition visible to the compiler in all translation units that use it; seeing the definition is a requirement for most compilers to inline the function. – James Kanze Jun 27 '11 at 07:49
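The formal meaning described in the last comment can be sketched as follows. The file names and functions are hypothetical, and everything is shown in one file so that it compiles on its own; in a real project the three parts would be separate files.

```cpp
// ==== util.h (hypothetical header) ====
#ifndef UTIL_H
#define UTIL_H
// Defined in the header, so every .cpp file that includes it gets its own
// copy of the definition. `inline` tells the toolchain these copies are the
// same function, so the linker must not report a multiple-definition error.
// It also puts the body in view of every caller, which most compilers need
// before they can inline the call at all.
inline int square(int x) { return x * x; }
#endif

// ==== a.cpp (would start with: #include "util.h") ====
int area_of_square(int side) { return square(side); }

// ==== b.cpp (would start with: #include "util.h") ====
int sum_of_squares(int a, int b) { return square(a) + square(b); }
```

If `inline` were removed from `square` and the header included from both `a.cpp` and `b.cpp`, linking the two object files would typically fail with a multiple-definition error. Whether any of the calls actually get inlined is still the compiler's decision.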
