
I know there are many similar questions on SO related to what I am about to ask. I've read many of them and still find the picture a bit unclear, so I decided to ask this question.

Given the following code: (which is essentially the same as in this question)

class A {
public:
    int a;
};

class B : public A {
public:
    int b;
};

int main() {
    A* p = new B;
    delete p;
}

I know that deleting p without a virtual destructor in A is undefined behavior according to the standard, and anything could happen thereafter. However, I'm interested in how the size of the polymorphic object on the heap is determined.

I found two claims on SO that potentially contradict each other (or I might have misunderstood them):

  • We need the virtual destructor to provide the information about the object on the heap for deleting the object through a base class pointer.

    Possible reference.

    As I understand such an implementation, if we have the virtual destructor, delete p; will first call the destructor of B (it knows that the first virtual destructor called corresponds to the type of the object that is really on the heap). The destructor of B knows the size of an object of type B, so the delete operation can extract that information from the first destructor called and use it to free the memory on the heap after the destructor calls are done.

    If this is what happens behind the scenes, then delete p; in the above code will not be able to free the heap memory occupied by the int b, because in the above case the first (and the only) destructor called is the destructor of A, and there is no information about int b to be extracted for the delete operation.

  • The heap can "automatically" deduce the size of the polymorphic object.

    Possible references: 1, 2.

    If this is what happens behind the scenes, then delete p; in the above code will be able to free the heap memory occupied by the int b. In this case delete p; "accidentally" works properly, since an object of B does not acquire resources from "elsewhere" (e.g. it has no pointer data members).

May I ask which one of the two understandings above is correct?

And if the heap can deduce the information about the polymorphic object, may I ask how exactly the heap does that?

CPPL
  • Note that the C++ standard doesn't specify this sort of implementation detail; both approaches give the same result in all well-defined cases, so implementations are allowed to use either. So it's possible that each answer is correct for at least one implementation even if there's no implementation for which both are correct. – ruakh Jun 14 '22 at 04:30
  • Are you familiar with `malloc` and `free`? – n. m. could be an AI Jun 14 '22 at 04:44
  • The compiler can't get the size from the destructor, but rather it could record the size in the objects vtable. – Goswin von Brederlow Jun 14 '22 at 04:58
  • The question is similar to the question of how the runtime knows how to delete a dynamically allocated array, because the size is not fixed there either. – gerum Jun 14 '22 at 06:14

3 Answers


This is part of the ABI.

On the Itanium ABI, the one typically used on Linux, the vtable records two entries for a virtual destructor. One entry calls the destructors but not delete; this is the complete object destructor. The other entry calls the destructors and also delete; this is the deleting destructor. See here.

Since the deleting destructor is virtual and type-specific, it knows the size of the complete object.

The other answers get a little mixed up about this. The destruction and the deletion are two logical steps, but it is common to provide a combined form because the requirements are hard to satisfy otherwise. new does not and cannot store size information in the object allocation itself (for non-array cases). That size is necessary because the sized operator delete is preferred. And the implementation cannot rely on a particular behavior because these functions are globally replaceable, so a different translation unit (TU) could define them.
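The point that the deleting destructor supplies the complete object's size can be observed with a small sketch. Note the assumptions: it uses a class-specific operator delete with a size parameter instead of replacing the global sized operator delete, because support for global sized deallocation depends on the compiler and its flags. The names Base, Derived, g_last_deleted_size and size_seen_when_deleting_through_base are made up for this illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <new>

// Hypothetical bookkeeping for this demo only: the size most recently
// passed to Base's class-specific operator delete.
static std::size_t g_last_deleted_size = 0;

struct Base {
    int a = 0;
    virtual ~Base() = default;

    // Class-specific deallocation function with a size parameter.
    // It records the size it was given, then releases the memory.
    static void operator delete(void* p, std::size_t n) noexcept {
        g_last_deleted_size = n;
        ::operator delete(p);
    }
};

struct Derived : Base {
    int b[8] = {};  // makes Derived strictly larger than Base
};

// Deletes a Derived through a Base* and reports which size the
// deallocation function received.
std::size_t size_seen_when_deleting_through_base() {
    g_last_deleted_size = 0;
    Base* p = new Derived;
    delete p;  // well-defined: Base has a virtual destructor
    return g_last_deleted_size;
}
```

Calling size_seen_when_deleting_through_base() should yield sizeof(Derived), not sizeof(Base), even though the static type of the pointer is Base*. The size comes from the dynamic type's destructor path rather than from anything stored in the allocation.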

Jeff Garrett

new implementations store metadata about the memory they allocate so that delete knows how to free the memory correctly. The size of the object is not calculated by delete, because new already knew the size up front.

Calling destructors is handled via normal polymorphic dispatch. That is why calling delete on a base pointer without a virtual destructor is undefined behavior.

Freeing the allocated memory is a separate operation afterwards, and doesn't need to depend on the type of the class that resides in the memory (unless the class overloads operator delete, that is).
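The two logical steps described here (destroy via virtual dispatch, then free) can be written out by hand. This is only a sketch of the idea, not what a compiler emits for a delete expression; Base, Derived, g_log and destroy_then_free are hypothetical names for this illustration.

```cpp
#include <cassert>
#include <new>
#include <string>

// Hypothetical log used only to observe the order of events.
static std::string g_log;

struct Base {
    virtual ~Base() { g_log += "~Base;"; }
};

struct Derived : Base {
    ~Derived() override { g_log += "~Derived;"; }
};

// Performs by hand the two steps that a delete expression combines.
std::string destroy_then_free() {
    g_log.clear();

    void* raw = ::operator new(sizeof(Derived));  // allocate raw storage
    Base* p = new (raw) Derived;                  // construct in place

    p->~Base();              // step 1: virtual call, runs ~Derived then ~Base
    g_log += "free;";
    ::operator delete(raw);  // step 2: release the storage, no type involved

    return g_log;
}
```

The explicit destructor call through the Base* reaches ~Derived only because the destructor is virtual. The deallocation step receives just the raw pointer.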

Remy Lebeau
  • Since C++14 `delete` generally has to calculate the size and pass it to `operator delete` for a single-object `delete` (assuming the types involved are complete). – user17732522 Jun 14 '22 at 06:17
  • Couldn't `new` just store the size as metadata, and then `delete` could extract the size to pass to `operator delete`? – Remy Lebeau Jun 14 '22 at 06:19
  • `new` has to call `operator new`, which may be replaced by the user anywhere in the program, and only the array-`new` form may ask `operator new` to overallocate memory. So the `new` expression in the single-object case has no space to store that information. – user17732522 Jun 14 '22 at 06:23
  • @user17732522 sounds like a step backwards from how it used to work – Remy Lebeau Jun 14 '22 at 06:30
  • I don't think anything really changed. Even before, you would need to inspect the dynamic type in order to figure out whether the pointer passed to the `delete` expression needs adjustments before being passed to `operator delete`. As far as I can tell this is implemented by having the most-derived virtual destructor call `operator delete`, so it has the size information for free anyway. Of course `operator new`/`operator delete` will then call `malloc`/`free` or something similar, which definitively do their own bookkeeping of sizes or at least block allocations. – user17732522 Jun 14 '22 at 06:35

I'm not sure either is correct for a typical implementation. Destroying an object and deallocating memory are two separate processes, and only the latter needs to know the size of the object, or more strictly the size of the block of allocated memory, which may be larger than the size of the object. A common way to do this is to store the size of the block of allocated memory in the bytes immediately preceding the pointer itself.

Also consider that the C++ heap would commonly be implemented by using the C heap API, which has no knowledge of destructors etc.
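The "size stored just before the pointer" idea can be sketched as a toy allocator layered on malloc. It is an illustration under stated assumptions, not how any particular C library lays out its heap; Header, toy_alloc, toy_size, toy_free and recovered_size_demo are hypothetical names.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>

// Hypothetical header placed immediately before each user block.
// The union pads it so the user block stays suitably aligned.
union Header {
    std::size_t size;
    std::max_align_t align;
};

// Allocates n usable bytes and records n in the preceding header.
void* toy_alloc(std::size_t n) {
    auto* h = static_cast<Header*>(std::malloc(sizeof(Header) + n));
    if (!h) return nullptr;
    h->size = n;
    return h + 1;  // the user sees the address just past the header
}

// Recovers the recorded size from the user pointer alone.
std::size_t toy_size(const void* p) {
    return (static_cast<const Header*>(p) - 1)->size;
}

// Frees the block; no type or size information is needed from the caller.
void toy_free(void* p) {
    if (p) std::free(static_cast<Header*>(p) - 1);
}

// Allocates 24 bytes, reads the size back, and frees the block.
std::size_t recovered_size_demo() {
    void* p = toy_alloc(24);
    if (!p) return 0;
    std::size_t n = toy_size(p);
    toy_free(p);
    return n;
}
```

Here toy_free needs only the pointer, which mirrors how free works without knowing anything about destructors or C++ types.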

john