
Let's say I have this class

class Point 
{
  inline float x() const { return v[0]; }
  inline float y() const { return v[1]; }
  inline float z() const { return v[2]; }

  float v[3];
};

And I do:

Point myPoint;
myPoint[0] = 5;

// unrelated code goes here

float myVal = myPoint.x() + 5;

Will GCC on -O2 or -O3 replace any calls to x() with a direct read of v[0]? I.e.:

float myVal = myPoint.v[0] + 5;

Or is there a reason why this is impossible?

Update: Should mention I do realize inline is more of a suggestion to the compiler than anything else, but was wondering anyways.

As an additional question, will templating this class have any effect on the optimizations that can be done?

Acorn
Tyler Shellberg
  • Yes it should but there is no guarantee :( Btw I doubt that making it `inline` twice would make any difference – Slava Feb 04 '20 at 21:52
  • Expanding on @Slava's comment: A function completely defined inside the body of a class is automatically an `inline` function (or if there are any differences between the two, I don't know them) – user4581301 Feb 04 '20 at 21:54
  • @user4581301 That's very interesting! I did not realize that at all. Now I'm really curious what the repercussions are, since it is not always ideal to mark things as inline, as per this discussion: https://softwareengineering.stackexchange.com/questions/377949/is-inlining-almost-all-of-my-c-applications-methods-a-good-or-bad-idea Or is there something fundamental I'm not seeing as to why it's automatically inline? – Tyler Shellberg Feb 04 '20 at 21:57
  • Obvious solution is to have anonymous `union` but unfortunately that's UB and you can only do that if you wear foil hat (at least I got that from video) – Slava Feb 04 '20 at 21:57
  • @TylerShellberg `inline` doesn't necessarily cause the function to be inlined. It's merely a hint; if the function is huge, a reasonable compiler will not inline it in any case. – HolyBlackCat Feb 04 '20 at 21:59
  • @TylerShellberg a method defined inside the `class` definition is implicitly marked as `inline`, and that has been the case since C++98. You can make it explicit if you like, but that does not change anything. It is similar to using `virtual` in a child class: you can make it explicit, but that does not change anything – Slava Feb 04 '20 at 22:00
  • @Slava Is there an explanation somewhere as to why that was done? And do the drawbacks mentioned in the post I linked not apply in this case? – Tyler Shellberg Feb 04 '20 at 22:03
  • See the generated code for yourself – Aykhan Hagverdili Feb 04 '20 at 22:03
  • @HolyBlackCat strictly speaking it is merely a hint for the optimizer, but from the language point of view the difference is significant: it allows multiple definitions of the function/method. – Slava Feb 04 '20 at 22:03
  • @Ayxan The generated code can vary dramatically depending on the context, I just wanted to know if there was any fundamental reasons it could *never* or would *always* happen. – Tyler Shellberg Feb 04 '20 at 22:04
  • @TylerShellberg is this what you are looking for? https://stackoverflow.com/questions/6108439/why-there-is-no-standard-way-to-force-inline-in-c – Slava Feb 04 '20 at 22:07
  • Note the actual modern meaning of `inline` in C++ is not really "a hint to the optimizer"; it's "this function can be defined in multiple translation units, and will always be defined in translation units where it is used". Functions defined in header files should be `inline` since they could be included multiple times, and the other part tells the compiler it doesn't necessarily need to generate the definition alone for the linker to use, since the definition will always be visible if it always chooses to inline it. – aschepler Feb 04 '20 at 22:14
  • @user4581301 I meant to ask, is it still marked as inline if only its declaration is in the class, and its definition is in another file? – Tyler Shellberg Feb 04 '20 at 22:32
  • The full definition must be inside the class to be `inline`. A declaration won't do. As pointed out above, the compiler may have other plans. See the [As If Rule](https://stackoverflow.com/questions/15718262/what-exactly-is-the-as-if-rule): The compiler can do whatever it wants to the code if you can't tell the difference. – user4581301 Feb 04 '20 at 22:58
  • @user4581301 Much appreciated, thank you. – Tyler Shellberg Feb 04 '20 at 22:59
  • Your code has some errors: All members are `private`, so they cannot be accessed. `myPoint[0] = 5;` doesn't work because `Point` has no `operator[]` overload. You probably meant `myPoint.v[0] = 5;` (after making the member `public`). – walnut Feb 04 '20 at 23:18
  • Some more reading closely related to this q: [LTO](https://www.google.com/search?q=link-time+optimization) and [PGO](https://www.google.com/search?q=profile-guided+optimization). – rustyx Feb 05 '20 at 09:16

3 Answers


Will GCC optimize away an inline accessor?

All optimizing compilers will do so. It is a trivial optimization compared to other ones.

Or is there a reason why this is impossible?

There is no reason that makes it impossible, yet there is no guarantee either.

As an additional question, will templating this class have any effect on the optimizations that can be done?

No. But, of course, a compiler may have different inlining thresholds for templates.
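A minimal sketch for checking the templated case yourself. `PointT`, `templated_shifted_x` and `check_templated` are hypothetical names chosen for illustration, not part of the question. The idea is only that the accessor body is still visible at the point of use, so the compiler is free to apply the same inlining decision as for the non-template class; whether it does is something to confirm in the generated assembly.

```cpp
#include <cassert>

// Hypothetical templated variant of the question's class.
template <typename T>
struct PointT
{
  // Defined in-class, so implicitly inline, exactly as in the non-template version.
  T x() const { return v[0]; }

  T v[3];
};

// The argument is only known at run time, so the result cannot be constant-folded.
// Inspect the assembly of this function to see whether x() was inlined.
float templated_shifted_x(const PointT<float>& p)
{
  return p.x() + 5;
}

// Small value check; it says nothing about inlining, only about correctness.
void check_templated()
{
  PointT<float> p{{5.0f, 0.0f, 0.0f}};
  assert(templated_shifted_x(p) == 10.0f);
}
```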

Acorn

You can observe the difference here: https://godbolt.org/

Let's say you have this code (yours does not compile: missing ; and Point has no operator[]):

struct Point 
{
  inline float x() const { return v[0]; }
  inline float y() const { return v[1]; }
  inline float z() const { return v[2]; }

  float v[3];
};

int main() {
    Point myPoint;
    myPoint.v[0] = 5;

    float myVal = myPoint.x() + 5;
    return myVal;
}

Then gcc 9.2, with no optimization flags, emits:

Point::x() const:
        push    rbp
        mov     rbp, rsp
        mov     QWORD PTR [rbp-8], rdi
        mov     rax, QWORD PTR [rbp-8]
        movss   xmm0, DWORD PTR [rax]
        pop     rbp
        ret
main:
        push    rbp
        mov     rbp, rsp
        sub     rsp, 16
        movss   xmm0, DWORD PTR .LC0[rip]
        movss   DWORD PTR [rbp-16], xmm0
        lea     rax, [rbp-16]
        mov     rdi, rax
        call    Point::x() const
        movss   xmm1, DWORD PTR .LC0[rip]
        addss   xmm0, xmm1
        movss   DWORD PTR [rbp-4], xmm0
        movss   xmm0, DWORD PTR [rbp-4]
        cvttss2si       eax, xmm0
        leave
        ret
.LC0:
        .long   1084227584

I am not that proficient in reading assembler, but I think comparing the above to the output with -O3 is convincing enough:

main:
        mov     eax, 10
        ret

The generated code can vary dramatically depending on the context, I just wanted to know if there was any fundamental reasons it could never or would always happen.

The above example already disproves the "never". "Always", however, is hard to guarantee. What the language guarantees is that the resulting program behaves as if the compiler had translated your code without any optimizations. With few exceptions, optimizations themselves are not guaranteed. To be really sure, I would rely only on looking at the compiler's output in the realistic scenario.
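As a rough sketch of what such a check could look like: the example above folds the whole program into a constant, which hides the inlining itself. Passing the point in from outside prevents that. `shifted_x` and `check_shifted_x` are hypothetical names added for illustration.

```cpp
#include <cassert>

struct Point
{
  float x() const { return v[0]; }

  float v[3];
};

// p comes from the caller, so the compiler cannot precompute the result.
// If x() is inlined, the body of this function should contain no call instruction.
float shifted_x(const Point& p)
{
  return p.x() + 5;
}

// Small value check; the interesting part is the assembly of shifted_x, not this.
void check_shifted_x()
{
  Point p{{5.0f, 0.0f, 0.0f}};
  assert(shifted_x(p) == 10.0f);
}
```

Compiling this with something like `g++ -O2 -S` (or pasting it into https://godbolt.org/) and reading the body of `shifted_x` shows whether a call to `Point::x()` survives for that compiler version and those flags.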

463035818_is_not_an_ai
  • The optimizer has in this case determined the behavior of the entire program! That's the output you would probably see from `int main() { return 10; }`. – aschepler Feb 04 '20 at 22:09
  • I think OP is asking whether this can be relied on **in general**. If that is the case, then the assembler output for one particular piece of code is useless to analyze. – Slava Feb 04 '20 at 22:09
  • @aschepler sure, but to arrive at that the compiler also had to optimize away the call to the function ;) – 463035818_is_not_an_ai Feb 04 '20 at 22:11
  • @Slava yes, added a note. One example is enough to show that there are cases where the optimization can be applied. I tried to think of a counterexample, but didn't find one – 463035818_is_not_an_ai Feb 04 '20 at 22:15

It may or may not be inlined. No guarantees there. But if you want it to be always inlined, use the [[gnu::always_inline]] attribute. See the docs here. Use this attribute only if you know what you're doing. In most cases it's best to let the compiler decide what optimizations are suitable.
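A minimal sketch of how the attribute could be applied to the accessor from the question. `ForcedPoint`, `forced_shifted_x` and `check_forced` are hypothetical names used only here. The attribute is a GCC extension (Clang accepts it too); other compilers may ignore it, possibly with a warning.

```cpp
#include <cassert>

struct ForcedPoint
{
  // GCC/Clang-specific: asks the compiler to inline this function at every call site.
  // The function is already implicitly inline because it is defined in-class.
  [[gnu::always_inline]] float x() const { return v[0]; }

  float v[3];
};

// Caller whose assembly can be inspected for a remaining call to x().
float forced_shifted_x(const ForcedPoint& p)
{
  return p.x() + 5;
}

// Small value check; the attribute does not change the result, only code generation.
void check_forced()
{
  ForcedPoint p{{5.0f, 0.0f, 0.0f}};
  assert(forced_shifted_x(p) == 10.0f);
}
```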

Aykhan Hagverdili
  • It's often not wise to assume I'm smarter than the compiler. In this case, is there a reason why forcing these to always be inline could make performance worse? IE, maybe some weird cache size issue or something? – Tyler Shellberg Feb 04 '20 at 22:10
  • @TylerShellberg Yes, you will make performance worse if you inline too much. That is why compilers don't always do it. – Acorn Feb 04 '20 at 22:11
  • @TylerShellberg in a function this small, it's unlikely. But in general, there is no reason to force inlining. Just let the compiler do its optimization. – Aykhan Hagverdili Feb 04 '20 at 22:12
  • @TylerShellberg https://isocpp.org/wiki/faq/inline-functions#inline-and-perf – François Andrieux Feb 04 '20 at 22:37