For a concrete type T, sizeof means two things:
the representation of a complete object a of type T occupies only sizeof(T) bytes, in the range [(char*)&a, (char*)&a + sizeof(T));
an array of T stores the second object sizeof(T) bytes after the first.
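The array property can be checked directly. This is only a minimal sketch: Sample, array_stride and check_array_stride are made-up names for illustration, and any concrete type would do in place of Sample.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical example type; any concrete type would do.
struct Sample {
    double d;
    char c;
};

// Returns the distance in bytes between the first and second array elements.
std::ptrdiff_t array_stride() {
    Sample arr[2] = {};
    const char* first = reinterpret_cast<const char*>(&arr[0]);
    const char* second = reinterpret_cast<const char*>(&arr[1]);
    return second - first;
}

// The stride is sizeof(Sample), padding included, not the sum of the member sizes.
void check_array_stride() {
    assert(array_stride() == static_cast<std::ptrdiff_t>(sizeof(Sample)));
}
```

Calling check_array_stride() from main should pass on any conforming compiler, whatever padding it chooses for Sample.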
The bytes occupied by complete objects don't overlap: either one is a subobject of the other and is contained in it, or they have no bytes in common.
You can overwrite a complete object (with memset) and then use placement new to reconstruct it (or simply assignment for objects without meaningful construction), and everything will be fine if the destructor wasn't important (don't do that if the destructor is responsible for the release of a resource). You can't overwrite only a base class subobject, as that will wreck the complete object. sizeof tells you how many bytes you can overwrite without breaking other objects.
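A minimal sketch of the overwrite-then-reconstruct idea, assuming a type whose destructor does nothing important. Point and overwrite_and_rebuild are hypothetical names, not something from a library.

```cpp
#include <cassert>
#include <cstring>
#include <new>

// Hypothetical type with no meaningful destructor (it owns no resource).
struct Point {
    int x;
    int y;
};

int overwrite_and_rebuild() {
    Point p = {1, 2};

    // Wipe every byte of the complete object; sizeof(Point) is the safe limit.
    std::memset(&p, 0xFF, sizeof(Point));

    // Build a fresh object in the same storage with placement new.
    Point* q = new (&p) Point{3, 4};
    assert(q == &p);  // same address, new object

    return q->x + q->y;
}
```

The same pattern would be a bad idea for a class whose destructor releases a resource, since the wiped object never gets a chance to release it.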
Data members of a class are complete objects, so the size of a class is always at least the sum of the sizes of its members.
Some types are "full": every bit in the object is meaningful; notably, unsigned char. Some types have unused bits or bytes. Many classes have such "holes" for padding. An empty class has no meaningful bits: no bit is part of the state, as there is no state. An empty class is a concrete class and can be instantiated; every instance has an identity, hence a distinct address, so its size couldn't be zero even if the standard allowed zero values of sizeof. An empty class is pure padding.
Consider:
struct intchar {
int i;
char c;
};
The alignment of intchar is the alignment of int. On a typical system where sizeof(int) is 4 and the alignment of these basic types equals their size, intchar has alignment 4 and size 8: the size corresponds to the distance between two array elements, so 3 bytes aren't used for the representation.
Given intchar_char
struct intchar_char {
intchar ic;
char c;
};
the size must be greater than the size of intchar, even though unused bytes exist in ic because of alignment: member ic is a complete object and occupies all its bytes, and memset is permitted on this object.
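These claims can be turned into compile-time and run-time checks. This is a sketch only: intchar_unused_bytes and memset_leaves_neighbor_intact are hypothetical helper names, and the figure of 3 unused bytes is what the typical system described above would give, not a guarantee.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

struct intchar {
    int i;
    char c;
};

struct intchar_char {
    intchar ic;
    char c;
};

// The size is a multiple of the alignment, so array elements stay aligned.
static_assert(sizeof(intchar) % alignof(intchar) == 0,
              "size is a multiple of alignment");

// ic keeps all of its bytes, padding included, so c must start after them.
static_assert(offsetof(intchar_char, c) >= sizeof(intchar),
              "c cannot live in the padding of ic");
static_assert(sizeof(intchar_char) > sizeof(intchar),
              "the enclosing class is strictly larger");

// Bytes of intchar not used by its members (3 on the typical system above).
std::size_t intchar_unused_bytes() {
    return sizeof(intchar) - (sizeof(int) + sizeof(char));
}

// memset over the whole member ic must not touch the neighbouring member c.
bool memset_leaves_neighbor_intact() {
    intchar_char v = {};
    v.c = 'x';
    std::memset(&v.ic, 0, sizeof v.ic);
    return v.c == 'x';
}
```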
sizeof is well defined only for concrete types (that can be instantiated) and complete objects. So you need sizeof to determine the size of an empty class if you want to create arrays of such; but for a base class subobject, sizeof doesn't give you the information you want.
There is no operator in C++ to measure how many bytes are used in the representation of a class, but you can try with a derived class:
template <class Base, int c=1>
struct add_chars : Base {
char dummy[c];
};
template <class T>
struct has_trailing_unused_space {
static const bool result = sizeof (add_chars<T>) == sizeof (T);
};
Note that add_chars<T> doesn't have a member of type T, so there is no T complete object and memset is not allowed on the T base class subobject (the intchar subobject, when T is intchar). dummy is a complete object that can't overlap with any other complete object, but it can overlap with a base class subobject.
The size of a derived class isn't always at least the sum of the sizes of its subobjects.
The member dummy occupies exactly one byte; if there is any trailing unused byte in Base, many compilers will allocate dummy in the unused space; has_trailing_unused_space tests this property.
#include <iostream>

struct empty {};  // the empty class under test

int main() {
    std::cout << "empty has trailing space: ";
    std::cout << has_trailing_unused_space<empty>::result;
}
outputs:
empty has trailing space: 1
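The same test can be pointed at other classes. The sketch below repeats the templates so it stands alone; intchar_ctor and report_trailing_space are hypothetical names added for illustration. No expected output is given for the last two lines because the answer depends on the ABI: as far as I know, GCC and Clang on Itanium ABI targets reuse tail padding only when the base is not a POD for layout purposes, and Visual C++ does not reuse it at all.

```cpp
#include <cassert>
#include <iostream>

template <class Base, int c = 1>
struct add_chars : Base {
    char dummy[c];
};

template <class T>
struct has_trailing_unused_space {
    static const bool result = sizeof(add_chars<T>) == sizeof(T);
};

struct empty {};

struct intchar {
    int i;
    char c;
};

// Hypothetical variant: same members, but a user-provided constructor,
// so it is no longer a POD type.
struct intchar_ctor {
    intchar_ctor() : i(0), c(0) {}
    int i;
    char c;
};

void report_trailing_space() {
    // The empty base case is the one every mainstream compiler optimizes.
    assert(has_trailing_unused_space<empty>::result);

    std::cout << "empty: " << has_trailing_unused_space<empty>::result << '\n';
    // ABI dependent: no particular value is promised for these two.
    std::cout << "intchar: " << has_trailing_unused_space<intchar>::result << '\n';
    std::cout << "intchar_ctor: " << has_trailing_unused_space<intchar_ctor>::result << '\n';
}
```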
Virtual inheritance
When considering the layout of classes involving virtual functions and virtual base classes, you need to consider the hidden vptr and internal pointers. They will have the same properties (size and alignment) as a void* in typical implementations.
class Derived2 : virtual public Empty
{};
Unlike normal inheritance and membership, virtual inheritance doesn't define a strict, direct, ownership relation, but a shared, indirect ownership, just like calling a virtual function introduces an indirection. Virtual inheritance creates two sorts of class layout: base class subobject and complete object layouts.
When a class is instantiated, the compiler will use the layout defined for complete objects, which can be, using a vptr as GCC does and the Itanium ABI stipulates:
struct Derived2 {
void *__vptr;
};
The vptr points to a complete vtable, with all the runtime information, but the C++ language doesn't consider such a class to be a polymorphic class, so dynamic_cast/typeid can't be used to determine the dynamic type.
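The language-level side of this can be checked with std::is_polymorphic. In this sketch WithVirtualFunction and the two helper functions are hypothetical names added for contrast; the size comparison reflects typical implementations rather than a guarantee of the standard.

```cpp
#include <cassert>
#include <type_traits>

struct Empty {};

class Derived2 : virtual public Empty {};

// Hypothetical contrast class: declaring a virtual function is what
// makes a class polymorphic.
struct WithVirtualFunction {
    virtual ~WithVirtualFunction() {}
};

// A virtual base alone does not make the class polymorphic.
static_assert(!std::is_polymorphic<Derived2>::value,
              "virtual base only: not polymorphic");
static_assert(std::is_polymorphic<WithVirtualFunction>::value,
              "virtual function: polymorphic");

// On typical implementations the hidden pointer makes Derived2 bigger
// than the empty class it derives from.
bool virtual_base_adds_hidden_data() {
    return sizeof(Derived2) > sizeof(Empty);
}

void check_hidden_data() {
    assert(sizeof(Derived2) >= sizeof(Empty));
}
```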
AFAIK, Visual C++ doesn't use a vptr but a pointer to subobject:
struct Derived2 {
Empty *__ptr;
};
And other compilers could use a relative offset:
struct Derived2 {
offset_t __off;
};
Derived2 is a very simple class; the subobject layout of Derived2 is the same as its complete object layout.
Now consider a slightly more involved case:
struct Base {
int i;
};
struct DerV : virtual Base {
int j;
};
Here the complete layout of DerV might be (Itanium ABI-style):
struct complete__DerV {
void *__vptr;
int j;
Base __base;
};
The subobject layout is
struct DerV {
void *__vptr;
int j;
};
All objects of type DerV, complete objects and base class subobjects alike, begin with this layout.
The vtable contains the relative offset of the virtual base: offsetof(complete__DerV,__base) in the case of an object of dynamic type DerV.
A call to a virtual function can be done by looking up the overrider at runtime or by knowing the dynamic type by language rules.
An upcast (conversion of a pointer to a derived class into a pointer to a virtual base class), which often happens implicitly when a base class member function is called on a derived object:
struct Base {
void f();
};
struct DerV : virtual Base {
};
DerV d;
d.f(); // involves a derived to base conversion
either uses the known offset when the dynamic type is known, as here, or uses the runtime information to determine the offset:
void foo (DerV &d) {
d.f(); // involves a derived to base conversion
}
can be translated to (Itanium ABI-style)
void foo (DerV &d) {
    // pseudocode: off__Base stands for the virtual base offset stored in the vtable
    ((Base*)((char*)&d + d.__vptr->off__Base))->f();
}
or Visual C++-style:
void foo (DerV &d) {
d.__ptr->f();
}
or even
void foo (DerV &d) {
    // pseudocode: the cast must be parenthesized before -> is applied
    ((Base*)((char*)&d + d.__off))->f();
}
The overhead depends on the implementation, but it's there whenever the dynamic type isn't known.
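The runtime offset can be observed from ordinary code. This is a sketch under assumptions: MostDerived, virtual_base_offset, upcast_reaches_same_member and demo_upcast are hypothetical names, and the actual offsets are implementation specific, so none are asserted beyond being positive for a complete DerV.

```cpp
#include <cassert>
#include <cstddef>

struct Base {
    int i;
};

struct DerV : virtual Base {
    int j;
};

// Hypothetical further-derived class: its DerV subobject may sit at a
// different distance from the shared Base than a complete DerV does.
struct MostDerived : DerV {
    int k;
};

// The dynamic type of d is unknown here, so the compiler has to find
// the Base subobject at run time.
std::ptrdiff_t virtual_base_offset(DerV& d) {
    Base* b = &d;  // implicit derived-to-virtual-base conversion
    return reinterpret_cast<char*>(b) - reinterpret_cast<char*>(&d);
}

// Whatever the offset is, the upcast must land on the same Base::i.
bool upcast_reaches_same_member(DerV& d) {
    d.i = 42;
    Base& b = d;
    return b.i == 42 && &b.i == &d.i;
}

void demo_upcast() {
    DerV d;
    MostDerived m;
    assert(upcast_reaches_same_member(d));
    assert(upcast_reaches_same_member(m));
}
```

Comparing virtual_base_offset(d) for a DerV object and for a MostDerived object is a simple way to see whether a given compiler places the virtual base at different offsets in the two cases.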