7

Consider the following code:

struct A {
    virtual A& operator+=(const A& other) noexcept = 0;
};

void foo_inner(int *p) noexcept { *p += *p; }
void foo_virtual_inner(A *p) noexcept { *p += *p; }

void foo(int *p) noexcept
{
    return foo_inner(p);
}

struct Aint : public A {
    int i;
    A& operator+=(const A& other) noexcept override final
    {
        // No devirtualization of foo_virtual with:
        i += dynamic_cast<const Aint&>(other).i;
        // ... nor with:
        // i += reinterpret_cast<const Aint&>(other).i;
        return *this;
    }
};

void foo_virtual(Aint *p) noexcept
{
    return foo_virtual_inner(p);
}

As far as I can tell, both foo() and foo_virtual() should compile to the same object code. The compiler has all the information it needs to devirtualize the call to operator+= in foo_virtual_inner() when it's called from foo_virtual(). But neither GCC 8.3, nor MSVC 19.10, nor clang 8 does this. Naturally, I used the maximum optimization flag (-O3 or /Ox).

Why? Is this a bug, or am I missing something?


clang 8 output:

foo(int*):                               # @foo(int*)
        shl     dword ptr [rdi]
        ret
foo_virtual(Aint*):                  # @foo_virtual(Aint*)
        mov     rax, qword ptr [rdi]
        mov     rax, qword ptr [rax]
        mov     rsi, rdi
        jmp     rax                     # TAILCALL

GCC 8.3 output:

foo(int*):
        sal     DWORD PTR [rdi]
        ret
foo_virtual(Aint*):
        mov     rax, QWORD PTR [rdi]
        mov     rax, QWORD PTR [rax]
        cmp     rax, OFFSET FLAT:Aint::operator+=(A const&)
        jne     .L19
        push    rbx
        xor     ecx, ecx
        mov     edx, OFFSET FLAT:typeinfo for Aint
        mov     esi, OFFSET FLAT:typeinfo for A
        mov     rbx, rdi
        call    __dynamic_cast
        test    rax, rax
        je      .L20
        mov     eax, DWORD PTR [rax+8]
        add     DWORD PTR [rbx+8], eax
        pop     rbx
        ret
.L19:
        mov     rsi, rdi
        jmp     rax
foo_virtual(Aint*) [clone .cold.1]:
.L20:
        call    __cxa_bad_cast

MSVC 19.10 output:

p$ = 8
void foo(int * __ptr64) PROC                                    ; foo
        mov     eax, DWORD PTR [rcx]
        add     eax, eax
        mov     DWORD PTR [rcx], eax
        ret     0
void foo(int * __ptr64) ENDP                                    ; foo

p$ = 8
void foo_virtual(Aint * __ptr64) PROC                  ; foo_virtual
        mov     rax, QWORD PTR [rcx]
        mov     rdx, rcx
        rex_jmp QWORD PTR [rax]
void foo_virtual(Aint * __ptr64) ENDP 

PS - What's the explanation for all of that typeinfo business in the compiled code under GCC?

einpoklum
  • It looks like the typeinfo for GCC is because it inlined the call to `operator+=` (and its contained `dynamic_cast`) after verifying that that is the function being called. What happens at `.L19`? – 1201ProgramAlarm Apr 02 '19 at 00:22
  • Try compiling with LTO enabled and see if this helps the linker to see the errors of its ways. http://blog.llvm.org/2017/03/devirtualization-in-llvm-and-clang.html – Michael Dorgan Apr 02 '19 at 01:57
  • @MichaelDorgan: 1. Can I do that with GodBolt somehow? 2. Why would the linker need to look into the internals of a function that has no external dependency? – einpoklum Apr 02 '19 at 10:04
  • @1201ProgramAlarm: Added the `.L19` and `.L20` lines (they were already available via the link though.) – einpoklum Apr 02 '19 at 17:11
  • Does it help to make `Aint` itself `final`? – Davis Herring Apr 03 '19 at 00:35
  • @DavisHerring: 1. You could check yourself through the GodBolt link. 2. No :-( – einpoklum Apr 03 '19 at 07:56
  • @einpoklum: Sorry, all I could do from a small screen was get your hopes up. – Davis Herring Apr 03 '19 at 13:13

3 Answers

4

GCC guesses that Aint *p points to an instance of Aint (but does not consider this guaranteed), so it speculatively devirtualizes the call to operator+=. The typeinfo checking comes from the inlined copy of that operator, which contains the dynamic_cast. Compiling with -fno-devirtualize-speculatively yields the same code as Clang and MSVC produce:

_Z11foo_virtualP4Aint:
.LFB4:
        .cfi_startproc
        movq    (%rdi), %rax
        movq    %rdi, %rsi
        movq    (%rax), %rax
        jmp     *%rax
Jan Hubička
  • I'm not sure I understand what you mean. GCC follows a pointer to the vtable and from there to operator+=. That's not de-virtualized, unless I'm missing something. – einpoklum Apr 04 '19 at 12:51
  • This is how speculative devirtualization works. If you think a pointer ptr will point to an instance whose virtual method foo is A::foo, you replace ptr->foo() with: if (ptr->foo == A::foo) A::foo(ptr); /* usually inlined */ else ptr->foo(); The point is to get the indirect call off the hot path and to enable inlining, which often enables other optimizations. (Nothing very interesting happens in this particular example, but I would still guess the speculatively devirtualized code is a tiny bit faster.) – Jan Hubička Apr 04 '19 at 12:54
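
For readers who want to see the guarded call from the comment above as source code, here is a minimal hand-written sketch. It is only an analogue, not what GCC actually emits: standard C++ cannot compare a vtable slot against a function address, so typeid stands in for that comparison. The names add_to_self_speculative and Acount are hypothetical, and the virtual destructor and default member initializer are additions that the question's classes do not have.

```cpp
#include <cassert>
#include <typeinfo>

struct A {
    virtual A& operator+=(const A& other) noexcept = 0;
    virtual ~A() = default;  // added for the sketch; not in the question
};

struct Aint : A {
    int i = 0;
    A& operator+=(const A& other) noexcept override final {
        // Assumes `other` really is an Aint, as in the question.
        i += static_cast<const Aint&>(other).i;
        return *this;
    }
};

// Hypothetical second implementation, used only to exercise the fallback path.
struct Acount : A {
    int calls = 0;
    A& operator+=(const A&) noexcept override {
        ++calls;
        return *this;
    }
};

// Hand-written analogue of a speculatively devirtualized `*p += *p`.
void add_to_self_speculative(A* p) noexcept {
    assert(p != nullptr);
    if (typeid(*p) == typeid(Aint)) {
        // Guess confirmed: the qualified call bypasses virtual dispatch,
        // so the compiler is free to inline it.
        static_cast<Aint*>(p)->Aint::operator+=(*p);
    } else {
        // Guess wrong: fall back to the ordinary virtual call.
        *p += *p;
    }
}

// Small helper showing the fast path in use.
int doubled(int v) {
    Aint a;
    a.i = v;
    add_to_self_speculative(&a);
    return a.i;
}
```

Both branches behave exactly like the plain virtual call; the guard only changes which one can be inlined.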
2

Following @JanHubicka's answer, I've filed a bug against GCC:

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=89924

and it's being worked on (!).

edit: OK, it wasn't really being worked on after all I guess :-(

einpoklum
-1

The compiler can't assume that an Aint* actually points to an Aint object until it sees some operation that would have undefined semantics otherwise, like referring to one of its non-static members. Otherwise it could be the result of reinterpret_cast from some other pointer type waiting to be reinterpret_casted back to that type.

It seems to me that the standard conversion to A* should be such an operation, but AFAICT the standard doesn't currently say that. Wording to that effect would need to consider converting to a non-virtual base of an object under construction, which is deliberately allowed.
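
To illustrate the round trip described above, here is a minimal sketch of an Aint* that never points to an Aint. Unrelated, pass_through and payload_via_round_trip are hypothetical names, and the alignas is an assumption made so that the cast back is guaranteed to yield the original pointer value.

```cpp
#include <cassert>

struct A {
    virtual A& operator+=(const A& other) noexcept = 0;
};

struct Aint : A {
    int i = 0;
    A& operator+=(const A& other) noexcept override final {
        i += static_cast<const Aint&>(other).i;
        return *this;
    }
};

// Hypothetical unrelated type. It is given Aint's alignment so that the
// pointer survives the reinterpret_cast round trip unchanged.
struct alignas(Aint) Unrelated {
    int payload;
};

// Carries an Aint* around without ever looking at the pointee.
Aint* pass_through(Aint* p) noexcept { return p; }

int payload_via_round_trip(Unrelated* u) {
    assert(u != nullptr);
    // The Aint* below does not point to an Aint object. That is fine as long
    // as nothing accesses the pointee through it.
    Aint* disguised = reinterpret_cast<Aint*>(u);
    Aint* same = pass_through(disguised);
    // Casting back recovers the original pointer, which is safe to use.
    return reinterpret_cast<Unrelated*>(same)->payload;
}
```

A function like pass_through sees only an Aint* and nothing that would be undefined for a non-Aint pointee, which is why the static type alone tells the compiler nothing about the dynamic type.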

Jason Merrill