How can I determine if a compiler uses early or late binding on a virtual function?

Question

I have the following code:

#include <iostream>
#include <string>
using namespace std;

class Pet {
public:
  virtual string speak() const { return ""; }
};

class Dog : public Pet {
public:
  string speak() const { return "Bark!"; }
};

int main() {
  Dog ralph;
  Pet* p1 = &ralph;
  Pet& p2 = ralph;
  Pet p3;

  // Late binding for both:
  cout << "p1->speak() = " << p1->speak() << endl;
  cout << "p2.speak() = " << p2.speak() << endl;

  // Early binding (probably):
  cout << "p3.speak() = " << p3.speak() << endl;
}

I have been asked to determine whether the compiler uses early or late binding for the final function call. I have searched online but have found nothing to help me. Can someone tell me how I would carry out this task?

By the way, what exact definition of late and early binding are you using? Probably not the one from Wikipedia (http://en.wikipedia.org/wiki/Early_binding)? – PlasmaHH Sep 30 '11 at 13:47
@PlasmaHH: Hmm, the link says early binding, but the page's title is late binding ;) – BlackBear Sep 30 '11 at 13:59
@BlackBear: It's Wikipedia. Note the "(Redirected from Early binding)". The point is that the OP probably means dynamic vs. static dispatch. – PlasmaHH Sep 30 '11 at 14:00
I'm not sure what you mean. Basically, my understanding is that early binding is done at compile time and late binding at run time. – Bap Johnston Sep 30 '11 at 14:02
There's an intermediate case: speculative binding. Make a direct call to the predicted function, check the dynamic type, and if necessary (mispredicted) do a virtual call. – MSalters Sep 30 '11 at 14:19
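
The speculative binding described in the last comment happens inside the compiler, but its shape can be written out by hand. The sketch below is only an illustration under assumptions: the function name `speak_speculative` and the "Pet" return string are made up here, and a real compiler works on its own internal representation rather than on `typeid`.

```cpp
#include <cassert>
#include <string>
#include <typeinfo>

struct Pet {
  virtual ~Pet() = default;
  virtual std::string speak() const { return "Pet"; }
};

struct Dog : Pet {
  std::string speak() const override { return "Bark!"; }
};

// Hand-written imitation of speculative binding: guess that *p is a Dog,
// verify the guess, and fall back to ordinary virtual dispatch otherwise.
std::string speak_speculative(const Pet* p) {
  assert(p != nullptr);  // typeid(*p) requires a valid object
  if (typeid(*p) == typeid(Dog)) {
    // Guess confirmed: the qualified name suppresses virtual dispatch,
    // so this is a direct call to Dog::speak.
    return static_cast<const Dog*>(p)->Dog::speak();
  }
  // Guess was wrong: normal (late-bound) virtual call.
  return p->speak();
}
```

Calling `speak_speculative` with the address of a `Dog` takes the direct path, while any other `Pet` falls through to the virtual call; both paths return the same result a plain `p->speak()` would.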

score 5 · Accepted Answer · answered Sep 30 '11 at 13:42

You can look at the disassembly, to see whether it appears to be redirecting through a vtable.

The clue is whether it calls directly to the address of the function (early binding) or calls a computed address (late binding). The other possibility is that the function is inlined, which you can consider to be early binding.

Of course, the standard doesn't dictate implementation details, so there may be other possibilities, but that covers "normal" implementations.
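
To make the direct-versus-computed distinction concrete, a minimal sketch along these lines can be compiled to assembly and inspected. The function names `call_on_object` and `call_through_pointer` are hypothetical, the "Pet" return value is changed from the question so the two cases are distinguishable, and the command in the first comment assumes a GCC-style toolchain. What you actually see depends on the compiler and the optimisation level.

```cpp
// Assumed way to get readable assembly with GCC:
//   g++ -O0 -S example.cpp -o - | c++filt
#include <string>

struct Pet {
  virtual ~Pet() = default;
  virtual std::string speak() const { return "Pet"; }
};

struct Dog : Pet {
  std::string speak() const override { return "Bark!"; }
};

// The static and dynamic type are both Pet here. Early binding would show up
// as a call to a fixed symbol, such as a call to Pet::speak() const.
std::string call_on_object() {
  Pet p3;
  return p3.speak();
}

// The dynamic type is unknown inside this function. Late binding would show
// up as a load from the object followed by an indirect call through a
// register or memory operand.
std::string call_through_pointer(const Pet* p) {
  return p->speak();
}
```

Keeping the two calls in separate small functions makes them easy to find in the listing; at higher optimisation levels either call may be inlined and disappear altogether.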

answered Sep 30 '11 at 13:42

Steve Jessop


I have produced the assembly code for the program, but I'm not sure what v-table creation looks like. Do vtables have a typical name to look out for? – Bap Johnston Sep 30 '11 at 13:58
If you don't see a vtable, you might see a `__vptr` pointer instead. – MSalters Sep 30 '11 at 14:01
@Bap: vtable *creation* has nothing to do with it. I'm saying look at the code for the *call*, and see whether it seems to be calling a constant-fixup address, or loading some value out of a location that depends somehow on the contents of the object (especially the contents of the first `sizeof(void*)` bytes of the object, which is where the virtual pointer usually lives). As MSalters says, you might catch sight of a name of some internal working of the compiler, perhaps `__vptr`, but that depends on the implementation and on the disassembler. – Steve Jessop Sep 30 '11 at 14:22
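
Where the virtual pointer lives can also be probed from within the program, which helps when matching up loads in the disassembly. This is a sketch under assumptions: the helpers `first_word` and `same_table` are hypothetical, and the code presumes an implementation that stores a vtable pointer in the first `sizeof(void*)` bytes of a polymorphic object, which is usual but not required by the standard.

```cpp
#include <cstdint>
#include <cstring>
#include <string>

struct Pet {
  virtual ~Pet() = default;
  virtual std::string speak() const { return ""; }
};

struct Dog : Pet {
  std::string speak() const override { return "Bark!"; }
};

// Copies the first pointer-sized bytes of an object into an integer.
// On implementations that keep the vptr at offset 0, this is the vptr value.
template <typename T>
std::uintptr_t first_word(const T& obj) {
  static_assert(sizeof(T) >= sizeof(std::uintptr_t), "object too small");
  std::uintptr_t word = 0;
  std::memcpy(&word, reinterpret_cast<const unsigned char*>(&obj), sizeof(word));
  return word;
}

// Two objects of the same dynamic type normally share one vtable, while
// objects of different dynamic types point at different vtables.
bool same_table(const Pet& a, const Pet& b) {
  return first_word(a) == first_word(b);
}
```

If `same_table` reports that two `Dog` objects match and that a `Dog` and a `Pet` differ, the value loaded from offset 0 before an indirect call in the disassembly is very likely that pointer.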

score 3 · Answer 2 · answered Sep 30 '11 at 14:34

You can always use a hack :D

//...
Pet p3;
memset(&p3, 0, sizeof(p3));
//...

If the compiler does use a vtbl pointer, guess what is going to happen :>

p3.speak()  // here

answered Sep 30 '11 at 14:34

GreenScape


score 2 · Answer 3 · answered Sep 30 '11 at 13:39

Look at the generated code. E.g. in Visual Studio you can set a breakpoint, then right-click and select "Go To Disassembly".

answered Sep 30 '11 at 13:39

Henrik


true for the general case of the question – sehe Sep 30 '11 at 13:41

score 2 · Answer 4 · answered Sep 30 '11 at 13:39

It uses early binding. You have an object, `p3`, of type Pet. While Pet is a base class with a virtual function definition, the type of the object is concrete and known at compile time, so the compiler doesn't have to consider the virtual function mapping to derived classes.

This is much the same as if you called speak() in the Pet constructor. Even when making derived objects, while the base class constructor is executing the type of the object is that of the base, so the function would not use the v-table; it would call the base type's version.

Basically, early binding is compile time binding and late binding is run-time binding. Run time binding is only used in instances where the compiler doesn't have enough type information at compile time to resolve the call.
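
The constructor case is easy to observe without a disassembler, because it changes which function runs rather than just how it is called. A small sketch, with the member `spoken_during_construction` and the "Pet" return value invented for the illustration (it shows which override is selected, not which call mechanism the compiler emitted):

```cpp
#include <string>

struct Pet {
  std::string spoken_during_construction;

  Pet() {
    // While this constructor runs, the object is still only a Pet,
    // so the call resolves to Pet::speak even if a Dog is being built.
    spoken_during_construction = speak();
  }
  virtual ~Pet() = default;
  virtual std::string speak() const { return "Pet"; }
};

struct Dog : Pet {
  std::string speak() const override { return "Bark!"; }
};
```

Constructing a `Dog` and reading `spoken_during_construction` gives the base version's result, while calling `speak()` on the finished object gives the derived one.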

answered Sep 30 '11 at 13:39

John Humphreys


That is just a possible optimization; the compiler is not forced to do it that way. As such, the only way to be sure is to look at the assembly (it might even change with compiler settings). – PlasmaHH Sep 30 '11 at 13:41
I'm almost positive that in situations where the type is clearly the base type, the language/compiler will always execute the base type function and the v-table will not be considered. The book Effective C++ usually looks at all catch-22's and it didn't present anything when covering that at least :/ Where did you learn it's just an optimization for the compiler? – John Humphreys Sep 30 '11 at 13:43
This is true for the `p3.speak()` call, but it cannot be said in general for the other calls without inspecting the generated code, even though the virtual call is very likely to be optimized away in both cases. – Nicola Musatti Sep 30 '11 at 13:44
I'd agree with Nicola's comment :) the other cases are definitely a lot less clear than p3.speak() in this regard. I wouldn't be sure what happened in them - though I can take a pretty good guess. – John Humphreys Sep 30 '11 at 13:45
@PlasmaHH: In the last case the compiler *must* perform static binding, because the call is performed on a class instance, rather than a pointer or a reference. – Nicola Musatti Sep 30 '11 at 13:47
There's no difference in the "observable behavior" of the program whether it uses the virtual or non-virtual mechanism, so it's safely in the realm of "up to the implementation to decide how to get it done". Since in this case you expect the early binding to be more efficient, that makes it an optimization issue. You're certainly right that in the case of `p3` there's no obvious reason to use the virtual mechanism. – Steve Jessop Sep 30 '11 at 13:48
This is an excerpt from the book I'm using: "The compiler knows the exact type and that it’s an object, so it can’t possibly be an object derived from Pet – it’s exactly a Pet. Thus, early binding is probably used. However, if the compiler doesn’t want to work so hard, it can still use late binding and the same behavior will occur." This would lead me to believe that late binding is more efficient; is this wrong? – Bap Johnston Sep 30 '11 at 13:48
@NicolaMusatti: Where does the standard state that? on assembler level the compiler handles things via pointers anyways, since it has at least to pass `this` as a pointer somehow (often in a register). – PlasmaHH Sep 30 '11 at 13:49
@Nicola: be careful - logically, it's a non-virtual call since the dynamic type of `p3` is certainly the same as its static type, whereas logically `p1` and `p2` are virtual calls since the dynamic type is different from the static type. However, there's a difference between which type of call it is and what call mechanism the implementation actually uses. It's permitted to optimise the virtual calls since data flow analysis can tell the dynamic type, and it's permitted to "pessimise" the non-virtual call since a compiler is permitted by the standard to waste time wherever it likes. – Steve Jessop Sep 30 '11 at 13:52
OK, I'll rephrase: only a very silly implementation would use late binding for a non-virtual call. At the assembler level early binding translates to a direct load of the function address, while late binding requires calculating where the function address is stored and loading it from that location. – Nicola Musatti Sep 30 '11 at 13:57
@BapJohnston Late binding is less efficient by a long shot. In late binding instead of having a direct call, the v-table has to be navigated to find the correct type's variation of the function to call. It's an extra level of indirection (at minimum). Also, the compiler may make interesting optimizations with early binding since it knows all the information ahead of time - it can't do much with late binding. – John Humphreys Sep 30 '11 at 13:58
No, late binding is less efficient than early binding, as I explained in my previous comment. – Nicola Musatti Sep 30 '11 at 14:03
@SteveJessop: The "as if" rule is to be taken "cum grano salis" even if the standard doesn't explicitly state so. An implementation that worked as you suggest would have faded into oblivion a long, long time ago. – Nicola Musatti Sep 30 '11 at 14:06
@Bap: you're saying that late binding is "more efficient", meaning that the compiler has to do less work. Whether it's actually true that the compiler has to do extra work in order to make a non-virtual call here is debatable, but even if it is true, when we talk about "efficiency" we generally mean the efficiency of the code emitted, that is, the run time cost. The time it takes to compile the code is a separate thing, and in C++ you're usually happy for the compiler to take some time to produce faster or smaller emitted code. – Steve Jessop Sep 30 '11 at 14:12
@Nicola: sure, but the question is how to find out what the compiler did, whereas w00te's answer and your comments are talking about what the compiler *would* do, if it's any good (and using the word "must" to describe that state of affairs). Predicting what the compiler will do is completely separate from confirming whether the prediction is correct. – Steve Jessop Sep 30 '11 at 14:15
To me, the question sounded more like an interview question where they wanted a quick answer on what would happen, but I'll concede that if you go into the detailed state of things it probably is a compiler decision. Honestly, I don't know whether the standard addresses this or not. – John Humphreys Sep 30 '11 at 14:28
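
The "extra level of indirection" discussed in these comments can be pictured by writing the mechanism out by hand. The following is a rough model, not what any particular compiler emits: `PetTable`, `PetObject` and the free functions are all hypothetical names, and real vtables carry more than a single function pointer.

```cpp
#include <string>

struct PetObject;

// Stand-in for a vtable: one slot per virtual function.
struct PetTable {
  std::string (*speak)(const PetObject&);
};

// Stand-in for a polymorphic object: a hidden pointer to its type's table.
struct PetObject {
  const PetTable* table;
};

std::string pet_speak(const PetObject&) { return ""; }
std::string dog_speak(const PetObject&) { return "Bark!"; }

const PetTable pet_table{pet_speak};
const PetTable dog_table{dog_speak};

// Early binding: the target is fixed when the code is compiled.
std::string speak_early(const PetObject& obj) {
  return pet_speak(obj);
}

// Late binding: load the table pointer, load the slot, then call through it.
std::string speak_late(const PetObject& obj) {
  return obj.table->speak(obj);
}
```

In this model `speak_early` always reaches `pet_speak` regardless of the table stored in the object, whereas `speak_late` pays for two loads before the call and reaches whichever function the object's table names.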

score 1 · Answer 5 · answered Sep 30 '11 at 13:57

In fact the compiler has no obligation to use either one particularly, just to make sure that the right function is called. In this case, your object is of the concrete type Pet, so as long as Pet::speak is called the compiler is "doing the right thing".

Now, given that the compiler can statically see the type of the object, I suspect that most compilers will optimize away the virtual call but there is no requirement that they do so.

If you want to know what your particular compiler is doing the only way is to consult its documentation, source code, or the generated disassembly.

I have looked at the assembled code, but I just can't see anything resembling vtable creation. I think I need to look up the documentation to see what I should be looking for. – Bap Johnston Sep 30 '11 at 14:09

score 0 · Answer 6 · answered Sep 30 '11 at 14:34

I just thought of a way to tell at runtime, without guesswork. You can simply overwrite the vptr of your polymorphic classes with 0 and see whether the method is called or you get a segmentation fault. This is what I get for my example:

Concrete: Base
Concrete: Derived
Pointer: Base
Pointer: Derived
DELETING VPTR!
Concrete: Base
Concrete: Derived
Segmentation fault

Where Concrete: T means that calling the virtual member function of T through a concrete type was successful. Analogously, Pointer: T says that calling the member function of T through a Base pointer was successful.

For reference, this is my test program:

#include <iostream>
#include <string.h>

struct Base {
  unsigned x;
  Base() : x(0xEFBEADDEu) {
  }
  virtual void foo() const {
    std::cout << "Base" << std::endl;
  }
};

struct Derived : Base {
  unsigned y;
  Derived() : Base(), y(0xEFCDAB89u) {
  }
  void foo() const {
    std::cout << "Derived" << std::endl;
  }
};

template <typename T>
void dump(T* p) {
  for (unsigned i = 0; i < sizeof(T); i++) {
    std::cout << std::hex << (unsigned)(reinterpret_cast<unsigned char*>(p)[i]);
  }
  std::cout << std::endl;
}

void callfoo(Base* b) {
  b->foo();
}

int main() {
  Base b;
  Derived d;
  dump(&b);
  dump(&d);
  std::cout << "Concrete: ";
  b.foo();
  std::cout << "Concrete: ";
  d.foo();
  std::cout << "Pointer: ";
  callfoo(&b);
  std::cout << "Pointer: ";
  callfoo(&d);
  std::cout << "DELETING VPTR!" << std::endl;
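  // Deliberately undefined behaviour: zero the first pointer-sized bytes,
  // which is where the vptr usually lives.
  memset(&b,0,sizeof(void*));
  memset(&d,0,sizeof(void*));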
  std::cout << "Concrete: ";
  b.foo();
  std::cout << "Concrete: ";
  d.foo();
  std::cout << "Pointer: ";
  callfoo(&b);
  std::cout << "Pointer: ";
  callfoo(&d);
  return 0;
}
