I'm wondering if C++ will still obey the inline keyword when a function is passed as an argument. In the following example, would a new frame for onFrame be pushed onto the stack every time frame() is called in the while loop?

#include <functional>

bool interrupt = false;

void run(std::function<void()> frame) {
    while(!interrupt) frame();
}

inline void onFrame() {
    // do something each frame
}

int main() {
    run(onFrame);
}

Or would changing to this have any effect?

void run(std::function<inline void()> frame) {
    while(!interrupt) frame();
}

If you have no definitive answer, can you help me find a way to test this? Possibly using memory addresses or some sort of debugger?

Evan Kennedy
  • [inline does not mean what you think it does](http://stackoverflow.com/q/1759300/212858). – Useless Mar 29 '16 at 16:39
  • related: http://stackoverflow.com/questions/6451866/why-use-functors-over-functions – NathanOliver Mar 29 '16 at 16:40
  • But what does this mean then? "*The intent of the inline keyword is to serve as an indicator to the optimizer that inline substitution of the function is preferred over function call, that is, instead of executing the call CPU instruction to transfer control to the function body, a copy of the function body is executed without generating the call.*" http://en.cppreference.com/w/cpp/language/inline – Evan Kennedy Mar 29 '16 at 17:02
  • As in my reply to the other copy of this comment: `intent != effect`. – Useless Mar 29 '16 at 17:16

3 Answers

It's going to be pretty hard for the compiler to inline your function if it has to go through std::function's type-erased dispatch to get there. It's possible it'll happen anyway, but you're making it as hard as possible. Your proposed alternative (taking a std::function<inline void()> argument) is ill-formed.

If you don't need type erasure, don't use type erasure. run() can simply take an arbitrary callable:

template <class F>
void run(F frame) {
    while(!interrupt) frame();
}

That is much easier for the compiler to inline. However, simply declaring a function inline does not in and of itself guarantee that the function gets inlined. See this answer.

Note also that when you're passing a function pointer, that also makes it less likely to get inlined, which is awkward. I'm trying to find an answer on here that had a great example, but until then, if inlining is super important, wrapping it in a lambda may be the way to go:

run([]{ onFrame(); });
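
Putting the pieces together, here is a minimal sketch of the template version with a lambda at the call site. The names `framesRun`, `kMaxFrames` and `runWithLambda` are hypothetical additions (not from the question) so that the loop terminates and the result can be checked. Whether the call actually gets inlined is still up to the compiler.

```cpp
#include <cassert>

bool interrupt = false;
int framesRun = 0;              // hypothetical counter, for illustration only
constexpr int kMaxFrames = 5;   // hypothetical limit so the loop terminates

inline void onFrame() {
    // Stand-in for per-frame work: count frames and stop after kMaxFrames.
    if (++framesRun == kMaxFrames) interrupt = true;
}

// Accepts any callable; its exact type is known when run() is instantiated.
template <class F>
void run(F frame) {
    while (!interrupt) frame();
}

// Hypothetical driver: resets the state, runs the loop, returns frames executed.
int runWithLambda() {
    interrupt = false;
    framesRun = 0;
    // The lambda has its own closure type, so the call to onFrame() inside it
    // is a direct call rather than a call through a pointer.
    run([] { onFrame(); });
    assert(interrupt);
    return framesRun;
}
```

Calling `runWithLambda()` from `main()` is enough to exercise it.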
Barry

still obey the inline keyword ... would a new frame ... be pushed onto the stack

That isn't what the inline keyword does in the first place (see this question for extensive reference).
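
A small sketch of why the keyword is not a command to the optimizer: an inline function is still an ordinary function with an address, and it can be called through a pointer like any other. What the keyword does guarantee is that the definition may appear in several translation units (for example via a header) without breaking the one-definition rule. The names `callCount` and `callThroughPointer` below are hypothetical, added only for illustration.

```cpp
#include <cassert>

int callCount = 0;  // hypothetical counter, for illustration only

// `inline` here permits this definition to live in a header that is included
// from several translation units. It does not force inline substitution.
inline void onFrame() { ++callCount; }

// Hypothetical helper: calls onFrame through a function pointer `times` times.
int callThroughPointer(int times) {
    assert(times >= 0);
    void (*fp)() = onFrame;  // taking the address of an inline function is allowed
    callCount = 0;
    for (int i = 0; i < times; ++i) fp();
    return callCount;
}
```

Whether the compiler sees through `fp` and inlines those calls anyway is an optimization decision, independent of the keyword.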


Assuming, as Barry does, that you're hoping to persuade the optimiser to inline your function call (once more for luck: this is nothing to do with the inline keyword), function template+lambda is probably the way to go.

To see why this is, consider what the optimiser has to work with in each of these cases:

  1. function template + lambda

    template <typename F>
    void run(F frame) { while(!interrupt) frame(); }
    
    // ... call site ...
    run([]{ onFrame(); });
    

    here, the function only exists at all (is instantiated from the template) at the call site, with everything the optimizer needs to work in scope and well-defined.

    Note the optimizer may still reasonably choose not to inline a call if it thinks the extra instruction cache pressure will outweigh the saving of a stack frame.

  2. function pointer

    void run(void (*frame)()) { while(!interrupt) frame(); }
    
    // ... call site ...
    run(onFrame);
    

    here, run may have to be compiled as a standalone function (although that copy may be thrown away by the linker if it can prove no-one used it), and same for onFrame, especially since its address is taken. Finally, the optimizer may need to consider whether run is called with many different function pointers, or just one, when deciding whether to inline these calls. Overall, it seems like more work, and may end up as a link-time optimisation.

    NB. I used "standalone function" to mean the compiler likely emits the code & symbol table entry for a normal free function in both cases.

  3. std::function

    This is already getting long. Let's just notice that this class goes to great lengths (the type erasure Barry mentioned) to make the function

    void run(std::function<void()> frame);
    

    not depend on the exact type of the function, which means hiding information from the compiler at the point it generates the code for run, which means less for the optimiser to work with (or conversely, more work required to undo all that careful information hiding).


As for testing what your optimiser does, you need to examine this in the context of your whole program: it's free to choose different heuristics depending on code size and complexity.

To be totally sure what it actually did, just disassemble with source or compile to assembler. (Yes, that's potentially a big "just", but it's platform-specific, not really on-topic for the question, and a skill worth learning anyway).
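
As a rough sketch of such a test, the three variants can be put side by side in one file and compiled to assembly. The file name, the `frames` counter and the `exercise` driver are hypothetical, and the command in the comment assumes GCC or Clang; other toolchains have their own equivalents.

```cpp
// inline_test.cpp (hypothetical file name)
// Assuming GCC or Clang, the assembly can be produced with, for example:
//   g++ -O2 -S inline_test.cpp -o inline_test.s
// Then look in inline_test.s for call instructions inside each run* variant.
#include <cassert>
#include <functional>

bool interrupt = false;
int frames = 0;  // hypothetical counter so the loops terminate

inline void onFrame() {
    if (++frames >= 3) interrupt = true;
}

// Variant 1: type-erased callable.
void runErased(std::function<void()> frame) { while (!interrupt) frame(); }

// Variant 2: plain function pointer.
void runPointer(void (*frame)()) { while (!interrupt) frame(); }

// Variant 3: function template, instantiated per callable type.
template <class F>
void runTemplate(F frame) { while (!interrupt) frame(); }

// Hypothetical driver: runs one variant and returns the number of frames.
int exercise(int which) {
    interrupt = false;
    frames = 0;
    switch (which) {
        case 0:  runErased(onFrame); break;
        case 1:  runPointer(onFrame); break;
        default: runTemplate([] { onFrame(); }); break;
    }
    assert(interrupt);
    return frames;
}
```

All three variants behave the same at run time; the differences, if any, only show up in the generated code, and they can change with compiler version and optimization level.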

Useless
  • But what does this mean then? "*The intent of the inline keyword is to serve as an indicator to the optimizer that inline substitution of the function is preferred over function call, that is, instead of executing the call CPU instruction to transfer control to the function body, a copy of the function body is executed without generating the call.*" http://en.cppreference.com/w/cpp/language/inline – Evan Kennedy Mar 29 '16 at 17:04
  • Read the paragraph immediately after the one you quoted. The only _binding_ requirement is on linkage, and it's only a hint w.r.t. the actual call site. – Useless Mar 29 '16 at 17:11
  • Thank you for all your help. I do have a problem with saying "That isn't what the `inline` keyword does in the first place". I feel that we should merely explain that compilers choose to implement the feature in a certain way rather than making a general statement about the language. – Evan Kennedy Mar 29 '16 at 18:17
  • I think the best form for case 1 is `template <class F> void run(F&& frame)`; it prevents a copy of the function object. `#include <iostream> struct A { A() {} A(const A&) { std::cout << "copied" << std::endl; } }; template <class F> void f1(F&& f) { f(); } template <class F> void f2(F&& f) { f1(f); } int main() { f2([a = A()] {}); return 0; }` – Hossein Noroozpour Dec 31 '21 at 12:59
  • A lambda with a non-empty capture is substantially different than one with an empty capture (stateful vs. stateless), so may not be a useful demonstration that any copying really happened in the original. – Useless Jan 01 '22 at 15:05

Compile for release and check the listing files, or turn on disassembly in the debugger. The best way to know is to check the generated code.