
The C++ object model does not contain any table for non-virtual member functions. When such a function is called,

a.my_function();

with name mangling it becomes something like

my_function__5AclassKd(&a)

The object contains only data members. There is no table for non-virtual functions. So, in such circumstances, how does the calling mechanism find out which function to call? What's going on under the hood?

Brad Larson
  • That is implementation dependent, but in general it would be a hard-coded jump to an address. – t.niese Feb 02 '18 at 19:08
  • Regarding your question, nearly everything "under the hood" is implementation dependent, but you can certainly see what's going on with *your* current implementation by writing a simple test program, compiling it to asm, and examining that. – WhozCraig Feb 02 '18 at 19:08
  • Do you understand the mechanism for calling regular free functions (such as `foo(42)`)? – Jarod42 Feb 02 '18 at 19:10
  • No table is needed for non-virtual calls; the function name directly corresponds to the function location. – Galik Feb 02 '18 at 19:10
  • The "calling mechanism" doesn't figure out what function to call, the compiler does. The compiler keeps a table of all the functions in each data type; it has its own memory and doesn't need to place them in the runtime memory of the objects. – Ben Voigt Feb 02 '18 at 19:13
  • You are all saying the compiler knows, and my question is how exactly the compiler knows. Please do not explain to me what virtual functions and the run-time calling mechanism are; stay focused on the actual question. –  Feb 02 '18 at 19:23
  • @ՎարդանԳրիգորյան Somewhat related: https://stackoverflow.com/questions/48550968/polymorphism-without-virtual-functions/48551288#48551288 –  Feb 02 '18 at 19:27

3 Answers


Formally, the standard doesn't require them to work in any specific way, but usually they work exactly like plain functions with one extra invisible parameter: a pointer to the object instance they're called on.

Of course, a compiler might be able to optimize that, e.g. by not passing the pointer if the member function doesn't use `this` or any member variables or member functions that require `this`.

HolyBlackCat
  • OK, but that `this` pointer actually knows nothing about where to find the called function. `this` does not contain, for example, the function's address. –  Feb 02 '18 at 19:14
  • @ՎարդանԳրիգորյան it doesn't *have* to. It isn't virtual, and thus cannot participate in polymorphic alteration. The call to the function is the call to *the function*. The same way the compiler knows where to find `printf`. The compiler has the class declaration; knows it exists, and emits a call to it (pushing the implicit `this`). The linker eventually fixes up the call, and that's that. Being a member function makes little difference; being non-virtual makes a *world* of difference. That's one implementation possibility (and likely yours). – WhozCraig Feb 02 '18 at 19:19
  • You are saying the compiler knows, and my question is how it knows. Please do not explain to me what a virtual function is and all these things. I know about them, and they do not concern the actual question. –  Feb 02 '18 at 19:22
  • @ՎարդանԳրիգորյան When you write `object.function(...)`, the compiler always knows the type of `object`. Thus it can look into the list of member functions for that type and pick one with the name you used (an appropriate overload of it). – HolyBlackCat Feb 02 '18 at 19:24
  • The compiler knows because it's already parsed the class definition, e.g. `class A { public: void myfunction(); };` earlier in the translation unit. – Daniel Schepler Feb 02 '18 at 19:26
  • HolyBlackCat, Daniel Schepler - Good answers thanks. –  Feb 02 '18 at 19:28

The job of the compiler (together with the linker) is to lay out the data and code the program needs at memory addresses. Each non-virtual function, whether member or non-member, gets a fixed virtual memory address at which it can be called. The calling machine code then hardcodes the address of the function to call: either an absolute address or, with position-independent code, an offset relative to the calling address.

For example, say your compiler is compiling a non-virtual member function that takes 20 bytes of machine code. If it is placing executable code at virtual addresses starting from 0x1000 and has already generated 10 bytes of executable code for other functions, it will start the code of this function at virtual address 0x100A. Code that wants to call the function then generates machine code for "call 0x100A" after placing any function call arguments (including a `this` pointer to the object to be operated upon) on the stack or in registers, as the calling convention requires.

You can easily see all this happening:

 ~/dev > cat example.cc         
#include <cstdio>

struct X
{
    int f(int n) { return n + 3; }
};

int main()
{
    X x;
    printf("%d\n", x.f(7));
}

~/dev > g++ example.cc -S; c++filt < example.s
    .file   "example.cc"
    .section    .text._ZN1X1fEi,"axG",@progbits,X::f(int),comdat
    .align 2
    .weak   X::f(int)
    .type   X::f(int), @function
X::f(int):    // code to execute X::f(int) starts at label .LFB0
.LFB0:        // when this assembly is converted to machine code
    .cfi_startproc    // it's given a virtual address
    pushq   %rbp
    .cfi_def_cfa_offset 16
    .cfi_offset 6, -16
    movq    %rsp, %rbp
    .cfi_def_cfa_register 6
    movq    %rdi, -8(%rbp)
    movl    %esi, -12(%rbp)
    movl    -12(%rbp), %eax
    addl    $3, %eax
    popq    %rbp
    .cfi_def_cfa 7, 8
    ret
    .cfi_endproc
.LFE0:
    .size   X::f(int), .-X::f(int)
    .section    .rodata
.LC0:
    .string "%d\n"
    .text
    .globl  main
    .type   main, @function
main:
.LFB1:
    .cfi_startproc
    pushq   %rbp
    .cfi_def_cfa_offset 16
    .cfi_offset 6, -16
    movq    %rsp, %rbp
    .cfi_def_cfa_register 6
    subq    $16, %rsp
    movq    %fs:40, %rax
    movq    %rax, -8(%rbp)
    xorl    %eax, %eax
    leaq    -9(%rbp), %rax
    movl    $7, %esi
    movq    %rax, %rdi
    call    X::f(int)     // call non-virtual member function;
                          //   machine code will hardcode the address
    movl    %eax, %esi    
    leaq    .LC0(%rip), %rdi
    movl    $0, %eax
    call    printf@PLT
    movl    $0, %eax
    movq    -8(%rbp), %rdx
    xorq    %fs:40, %rdx
    je  .L5
    call    __stack_chk_fail@PLT
.L5:
    leave
    .cfi_def_cfa 7, 8
    ret
    .cfi_endproc
.LFE1:
    .size   main, .-main
    .ident  "GCC: (Ubuntu 7.2.0-8ubuntu3) 7.2.0"
    .section    .note.GNU-stack,"",@progbits

If you compile a program and then look at the disassembly, it will usually show the actual virtual address offsets too.

Tony Delroy

With non-virtual functions, there is no need to determine at runtime which function to call, so the resulting machine code will typically look the same as a normal function call, just with an extra argument for `this` as indicated in your example. (It's not always identical, though: for example, MSVC's default `__thiscall` convention for 32-bit x86 programs passes `this` in the ECX register instead of on the stack as for usual function parameters.)

Thus, the determination of which function to call is made by the compiler at compile time. At that time, it has the information determined from parsing class declarations that it can use, for example to do method overload resolution, and from there to either calculate or look up the mangled name to put into assembly code.

Daniel Schepler