
I am reading Inside the C++ Object Model. Section 1.3 says:

So, then, why is it that, given

Bear b; 
ZooAnimal za = b; 

// ZooAnimal::rotate() invoked 
za.rotate(); 

the instance of rotate() invoked is the ZooAnimal instance and not that of Bear? Moreover, if memberwise initialization copies the values of one object to another, why is za's vptr not addressing Bear's virtual table?

The answer to the second question is that the compiler intercedes in the initialization and assignment of one class object with another. The compiler must ensure that if an object contains one or more vptrs, those vptr values are not initialized or changed by the source object.

So I wrote the test code below:

#include <stdio.h>
#include <string.h>

class Base{
public:
    virtual void vfunc() { puts("Base::vfunc()"); }
};
class Derived: public Base
{
public:
    virtual void vfunc() { puts("Derived::vfunc()"); }
};

int main()
{
    Derived d;
    Base b_assign = d;
    Base b_memcpy;
    memcpy(&b_memcpy, &d, sizeof(Base));

    b_assign.vfunc();
    b_memcpy.vfunc();

    printf("sizeof Base : %zu\n", sizeof(Base)); // sizeof yields size_t, so use %zu

    Base &b_ref = d;
    b_ref.vfunc();

    printf("b_assign: %x; b_memcpy: %x; b_ref: %x\n", 
        *(int *)&b_assign,
        *(int *)&b_memcpy,
        *(int *)&b_ref);
    return 0;
}

The result

Base::vfunc()
Base::vfunc()
sizeof Base : 4
Derived::vfunc()
b_assign: 80487b4; b_memcpy: 8048780; b_ref: 8048780

My question is: why does b_memcpy.vfunc() still call Base::vfunc()?

curiousguy
Divlaker
  • Possible duplicate of [Why would the behavior of std::memcpy be undefined for objects that are not TriviallyCopyable?](http://stackoverflow.com/questions/29777492/why-would-the-behavior-of-stdmemcpy-be-undefined-for-objects-that-are-not-triv) – Ken Y-N Dec 12 '16 at 04:28
  • I suppose answer is disassembler, and hint is `&b_memcpy == &b_ref` – fghj Dec 12 '16 at 04:32
  • The behavior is undefined. Any outcome you see can easily be different with a different compiler, different compiler options, optimizations, etc. – PaulMcKenzie Dec 12 '16 at 04:48
  • I changed my test code to [this](http://ideone.com/nGnvdY) and the result is same as I expected. – Divlaker Dec 12 '16 at 05:07
  • @PaulMcKenzie The vptr doesn't even officially exist, so any manipulation of a vptr is going to be unspecified or UB. Still, some manipulations should work, as long as we can plausibly claim the rules weren't broken. – curiousguy Jan 28 '17 at 00:20

4 Answers


What you are doing is illegal in the C++ language, meaning that the behavior of your b_memcpy object is undefined. That means any behavior is "correct" and your expectations are completely unfounded. There's not much point in trying to analyze undefined behavior: it is not supposed to follow any logic.

In practice, it is quite possible that your manipulations with memcpy did actually copy Derived's virtual table pointer into the b_memcpy object, and your experiments with b_ref confirm that. However, when a virtual method is called through an immediate object (as is the case with the b_memcpy.vfunc() call), most implementations optimize away the access to the virtual table and perform a direct (non-virtual) call to the target function. The formal rules of the language state that no legal action can ever make the b_memcpy.vfunc() call dispatch to anything other than Base::vfunc(), which is why the compiler can safely replace this call with a direct call to Base::vfunc(). This is why virtual table manipulations will normally have no effect on the b_memcpy.vfunc() call.
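To make the "no legal action" point concrete, here is a minimal sketch that stays within defined behavior. The names (`name()`, `call_on_sliced_copy`, `call_through_reference`, `sliced_copy_is_base`) are made up for illustration and are not from the question. It shows that a named Base object always has dynamic type Base, even when copy-initialized from a Derived, while a reference dispatches on the type of the object it refers to.

```cpp
#include <string>
#include <typeinfo>

struct Base {
    virtual ~Base() = default;
    virtual std::string name() const { return "Base"; }
};

struct Derived : Base {
    std::string name() const override { return "Derived"; }
};

// Copy-initializing a Base from a Derived slices: the result is a complete
// Base object, so the call can only ever reach Base::name().
inline std::string call_on_sliced_copy(const Derived& d) {
    Base copy = d;
    return copy.name();
}

// A reference creates no new object; the call dispatches on the dynamic
// type of whatever object the reference is bound to.
inline std::string call_through_reference(const Base& b) {
    return b.name();
}

// typeid on a polymorphic object reports its dynamic type.
inline bool sliced_copy_is_base(const Derived& d) {
    Base copy = d;
    return typeid(copy) == typeid(Base);
}
```

Given a `Derived d;`, `call_on_sliced_copy(d)` yields "Base" and `call_through_reference(d)` yields "Derived". Because the first result is guaranteed by the language, a compiler is free to emit a direct call for it.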

AnT stands with Russia

The behavior you've invoked is undefined because the standard says it's undefined, and your compiler takes advantage of that fact. Let's look at g++ for a concrete example. The assembly it generates for the line b_memcpy.vfunc(); with optimizations disabled looks like this:

lea     rax, [rbp-48]
mov     rdi, rax
call    Base::vfunc()

As you can see, the vtable wasn't even referenced. Since the compiler knows the static type of b_memcpy it has no reason to dispatch that method call polymorphically. b_memcpy can't be anything other than a Base object, so it just generates a call to Base::vfunc() as it would with any other method call.

Going a bit further, let's add a function like this:

void callVfunc(Base& b)
{
  b.vfunc();
}

Now if we call callVfunc(b_memcpy); the result depends on the optimization level at which I compile the code. At -O0 and -O1, Derived::vfunc() is printed; at -O2 and -O3, Base::vfunc() is printed. Again, since the standard says the behavior of your program is undefined, the compiler makes no effort to produce a predictable result and simply relies on the assumptions the language allows it to make. Since the compiler knows b_memcpy is a Base object, it can inline the call as puts("Base::vfunc()"); when the optimization level allows for it.

Miles Budnek
  • Thank you for your answer. I have understood. Only using pointer or reference can trigger the vptr. – Divlaker Dec 12 '16 at 05:15

You aren't allowed to do

memcpy(&b_memcpy, &d, sizeof(Base));

- it's undefined behaviour, because b_memcpy and d aren't "plain old data" objects (because they have virtual member functions).

If you wrote:

b_memcpy = d;

then it would print Base::vfunc() as expected.
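As a rough sketch of both points (the `payload` member and the helper names are assumptions added for illustration, not part of the question's code): a class with a virtual function is not trivially copyable, which is the property memcpy relies on, and ordinary assignment copies only the Base data members while leaving the target's dynamic type alone.

```cpp
#include <string>
#include <type_traits>

struct Base {
    virtual ~Base() = default;
    virtual std::string name() const { return "Base"; }
    int payload = 0;  // hypothetical data member, added to show what assignment copies
};

struct Derived : Base {
    std::string name() const override { return "Derived"; }
};

// The standard only gives memcpy a defined meaning for trivially copyable
// types; a class with virtual functions is not one of them.
static_assert(!std::is_trivially_copyable<Base>::value,
              "Base has a virtual function, so it is not trivially copyable");

// Assigning a Derived to a Base runs Base::operator=, which copies the
// Base data members. The target stays a Base object, so even a call
// through a reference dispatches to Base::name().
inline std::string name_after_assignment(const Derived& d) {
    Base b;
    b = d;
    const Base& ref = b;
    return ref.name();
}

inline int payload_after_assignment(const Derived& d) {
    Base b;
    b = d;
    return b.payload;
}
```

With a `Derived d;` whose `payload` is set to 7, `name_after_assignment(d)` gives "Base" and `payload_after_assignment(d)` gives 7: the data is copied, the type is not.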

user253751
  • I just test my idea. You can see sizeof(Base) is 4, so I just copy the vptr pointer. – Divlaker Dec 12 '16 at 04:48
  • @Divlaker Indeed, but you're not allowed to copy the vptr pointer. – user253751 Dec 12 '16 at 20:44
  • @immibis You are not officially allowed to do it, ctors can. You need to have an excuse to run a ctor. – curiousguy Jan 28 '17 at 00:21
  • @immibis I am not sure what you mean by "copy". I know of reads and writes, not copies. – curiousguy Jan 28 '17 at 01:06
  • @curiousguy I mean whatever you meant. You're the one who said that "ctors could [copy the vptr pointer]" – user253751 Jan 28 '17 at 01:33
  • @immibis No. I said nothing about "copy" as I don't know what counts as one. Is `x=y=0;` a copy? What about `y=0; x=0;` I said that only ctors can modify a vptr. That modify could count as a "copy" if you (and your god) want. – curiousguy Jan 28 '17 at 01:44
  • @curiousguy Copying something means reading a value from one object and writing it to another object. In this case it would be reading `d`'s vptr and writing it to `b_memcpy`. I also find it quite ridiculous that you don't know the meaning of "copy" here especially since the question illustrates it with code. – user253751 Jan 28 '17 at 01:47
  • Let us [continue this discussion in chat](http://chat.stackoverflow.com/rooms/134232/discussion-between-curiousguy-and-immibis). – curiousguy Jan 28 '17 at 01:48

Any use of a vptr is outside the scope of the standard

Granted, the use of memcpy here has UB

The answers pointing out that any use of memcpy, or other byte manipulation of non-PODs, that is, of any object with a vptr, has undefined behavior, are strictly technically correct but do not answer the question. The question is predicated on the existence of a vptr (vtable pointer), which isn't even mandated by the standard: of course the answer will involve facts outside the standard and the result will not be guaranteed by the standard!

Standard text is not relevant regarding the vptr

The issue is not that you are not allowed to manipulate the vptr; the notion of being allowed by the standard to manipulate anything that is not even described in the standard text is absurd. Of course no standard way to change the vptr will exist, and this is beside the point.

The vptr encodes the type of a polymorphic object

The issue here is not what the standard says about the vptr, the issue is what the vptr represents, and what the standard says about that: the vptr represents the dynamic type of an object. Whenever the result of an operation depends on the dynamic type, the compiler will generate code to use the vptr.

[Note regarding MI: I say "the" vptr (as if there were only one vptr), but when MI (multiple inheritance) is involved, objects can have more than one vptr, each representing the complete object viewed as a particular polymorphic base class type. (A polymorphic class is a class with at least one virtual function.)]

[Note regarding virtual bases: I mention only the vptr, but some compilers insert other pointers to represent aspects of the dynamic type, like the location of virtual base subobjects, and some other compilers use the vptr for that purpose. What is true about the vptr is also true about these other internal pointers.]

So a particular value of the vptr corresponds to a dynamic type: that is, the type of the most derived object.

Changes of the dynamic type of an object during its lifetime

During construction, the dynamic type changes, and that is why virtual function calls from inside a constructor can be "surprising". Some people say that the rules for calling virtual functions during construction are special, but they are absolutely not: the final overrider is called; that overrider is the one of the class corresponding to the most derived object that has been constructed so far, and in a constructor C::C(arg-list), that is always the class C.

During destruction, the dynamic type changes in the reverse order. Calls to virtual functions from inside destructors follow the same rules.
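The construction and destruction rule can be sketched as follows. The `event_log` helper, the `who()` function, and `run_lifetime_demo` are hypothetical names used only for this illustration. The Base constructor and destructor each make a virtual call, and the call made while the object is complete goes through a Base reference.

```cpp
#include <string>
#include <vector>

// Hypothetical helper: a process-wide list of recorded events.
inline std::vector<std::string>& event_log() {
    static std::vector<std::string> entries;
    return entries;
}

struct Base {
    // While Base's constructor or destructor runs, the object's dynamic
    // type is Base, so who() resolves to Base::who().
    Base() { event_log().push_back("ctor sees " + who()); }
    virtual ~Base() { event_log().push_back("dtor sees " + who()); }
    virtual std::string who() const { return "Base"; }
};

struct Derived : Base {
    std::string who() const override { return "Derived"; }
};

// Records what each virtual call resolves to over the lifetime of a Derived.
inline std::vector<std::string> run_lifetime_demo() {
    event_log().clear();
    {
        Derived d;
        Base& ref = d;
        event_log().push_back("complete object is " + ref.who());
    }  // d is destroyed here: ~Derived runs first, then ~Base
    return event_log();
}
```

The recorded events are "ctor sees Base", "complete object is Derived", "dtor sees Base", which is the dynamic type going up during construction and back down during destruction.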

What it means when something is left undefined

You can do low-level manipulations that are not sanctioned by the standard. That a behavior is not explicitly defined in the C++ standard does not imply that it is not described elsewhere. Just because the result of a manipulation is explicitly described as having UB (undefined behavior) in the C++ standard does not mean your implementation cannot define it.

You can also use your knowledge of the way compilers work: if strict separate compilation is used, that is, when the compiler can get no information from separately compiled code, every separately compiled function is a "black box". You can use this fact: the compiler has to assume that anything a separately compiled function could do will be done. Even inside a given function, you can use an asm directive to get the same effect: an asm directive with no constraint can do anything that is legal in C++. The effect is a "forget what you know from code analysis at that point" directive.

The standard describes what can change the dynamic type, and nothing is allowed to change it except construction/destruction, so only an "external" (black box) function that is otherwise allowed to perform construction/destruction can change a dynamic type.

Calling constructors on an existing object is not allowed, except to reconstruct it with the exact same type (and with restrictions); see [basic.life]/8:

If, after the lifetime of an object has ended and before the storage which the object occupied is reused or released, a new object is created at the storage location which the original object occupied, a pointer that pointed to the original object, a reference that referred to the original object, or the name of the original object will automatically refer to the new object and, once the lifetime of the new object has started, can be used to manipulate the new object, if:

(8.1) the storage for the new object exactly overlays the storage location which the original object occupied, and

(8.2) the new object is of the same type as the original object (ignoring the top-level cv-qualifiers), and

(8.3) the type of the original object is not const-qualified, and, if a class type, does not contain any non-static data member whose type is const-qualified or a reference type, and

(8.4) the original object was a most derived object ([intro.object]) of type T and the new object is a most derived object of type T (that is, they are not base class subobjects).

This means that the only cases where you could call a constructor (with placement new) and still use the same expressions that used to designate the object (its name, pointers to it, etc.) are those where the dynamic type would not change, so the vptr would still be the same.
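A minimal sketch of the permitted case, assuming a hypothetical `Shape` class and helper names that are not from the question: the object is destroyed and a new object of the exact same type is created in the same storage, so the original name keeps working and the dynamic type is unchanged.

```cpp
#include <new>
#include <string>

struct Shape {
    explicit Shape(int sides) : sides(sides) {}
    virtual ~Shape() = default;
    virtual std::string kind() const { return "Shape"; }
    int sides;  // non-const, non-reference member, as condition (8.3) requires
};

// Ends the lifetime of `s` and creates a new Shape in the same storage.
// Assumption: the caller passes a most derived Shape object (not a base
// subobject of some derived class), otherwise condition (8.4) is violated.
inline void rebuild_in_place(Shape& s, int sides) {
    s.~Shape();
    ::new (static_cast<void*>(&s)) Shape(sides);
}

// The name `s` refers to the new object after the rebuild because the
// new object has the same type and exactly overlays the old storage.
inline int rebuilt_sides() {
    Shape s(3);
    rebuild_in_place(s, 4);
    return s.sides;
}
```

Here `rebuilt_sides()` returns 4 and `kind()` still reports "Shape". Constructing a different type in that storage and then using `s` would fall outside what the quoted paragraph allows.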

In other words, if you want to overwrite the vptr using low-level tricks, you can, but only if you write the same value.

In other words, don't try to hack the vptr.

curiousguy