54

I found that both MSVC and GCC compilers allocate at least one byte per each class instance even if the class is a predicate with no member variables (or with just static member variables). The following code illustrates the point.

#include <iostream>

class A
{
public:
   bool operator()(int x) const
   {
      return x>0;
   }
};

class B
{
public:
   static int v;
   static bool check(int x)
   {
      return x>0;
   }
};

int B::v = 0;

void test()
{
   A a;
   B b;
   std::cout << "sizeof(A)=" << sizeof(A) << "\n"
             << "sizeof(a)=" << sizeof(a) << "\n"
             << "sizeof(B)=" << sizeof(B) << "\n"
             << "sizeof(b)=" << sizeof(b) << "\n";
}

int main()
{
   test();
   return 0;
}

Output:

sizeof(A)=1
sizeof(a)=1
sizeof(B)=1
sizeof(b)=1

My question is why does compiler need it? The only reason that I can come up with is ensure that all member var pointers differ so we can distinguish between two members of type A or B by comparing pointers to them. But the cost of this is quite severe when dealing with small-size containers. Considering possible data alignment, we can get up to 16 bytes per class without vars (?!). Suppose we have a custom container that will typically hold a few int values. Then consider an array of such containers (with about 1000000 members). The overhead will be 16*1000000! A typical case where it can happen is a container class with a comparison predicate stored in a member variable. Also, considering that a class instance should always occupy some space, what type of overhead should be expected when calling A()(value) ?

bkxp
  • 1,115
  • 1
  • 12
  • 20
  • 7
    Just to confirm your suspicion: *Unless it is a bit-field (9.6), a most derived object shall have a non-zero size and shall occupy one or more bytes of storage. Base class subobjects may have zero size.* – chris Sep 11 '14 at 12:00
  • 3
    FYI: subobjects of zero size *are* allowed. So, if you derive from such an empty class and add another member of size x, chances are that the size of your derived type is also x. This is known as "empty base class optimization" – sellibitze Sep 11 '14 at 12:04
  • 8
    I believe you have several questions overlapping there. There is no point to "store" a huge number of classes with no members in a container. After all, since there is no data, there is no difference between them. However, the fact that classes with no members have a non-zero size in C++ does not mean that classes that do have members would have unnecessary overhead. The memory alignment issue, however, is an independent issue and is not limited to C++. – AlefSin Sep 11 '14 at 12:05
  • 1
    possible duplicate of [C++: What is the size of an object of an empty class?](http://stackoverflow.com/questions/621616/c-what-is-the-size-of-an-object-of-an-empty-class) – wilx Sep 12 '14 at 10:21
  • It might seem a bit artificial, but I suppose one could store elements with "no data" in a container, e.g. on a stack. Perhaps each represented an event - and did something to a global of some sort. Then it would be natural to remove them as they were processed. – Fred Mitchell Sep 16 '14 at 21:40
  • related: there's [boost::compressed_pair](https://stackoverflow.com/q/7694158/995714) to save memory in this case – phuclv Jun 11 '18 at 06:31

4 Answers4

77

It’s necessary to satisfy an invariant from the C++ standard: every C++ object of the same type needs to have a unique address to be identifiable.

If objects took up no space, then items in an array would share the same address.

Konrad Rudolph
  • 530,221
  • 131
  • 937
  • 1,214
  • 1
    BTW, I guess that if you have a local variable with such an empty `struct` (or `class`) it might be often optimized to take no space at all on the call stack. – Basile Starynkevitch Sep 11 '14 at 12:00
  • 1
    I wonder: since there is no accessible data, could a conforming implementation "allocate" an array of empty classes at a very high address which would never actually be usable? For example one of those "impossible" pointer values on x86_64? – John Zwinck Sep 11 '14 at 12:01
  • Arrays with no values occupy exactly zero bytes, and C++ have no problem dealing with those. Why is it important to distinguish instances by pointers if they are identical by definition and can't be used in any expression by value? – bkxp Sep 11 '14 at 12:02
  • 3
    @bkxp: while two instances of an empty class may be identical by "value semantics" they are not identical by all C++ semantics. &a == &b must return false if a and b are two different instances. And an array of something must have elements which are addressable. – John Zwinck Sep 11 '14 at 12:04
  • I see no point in allowing different instances of classes that can't actually differ. Maybe is it needed to support C++ class inheritance, when the base class has no members, but must still be distinguishable. – bkxp Sep 11 '14 at 12:05
  • 4
    Scratch my previous comment, C++ does not allow arrays of size zero at all. – Konrad Rudolph Sep 11 '14 at 12:09
  • 3
    @bkxp: I think the Empty Base Optimization will surprise you, then. – John Zwinck Sep 11 '14 at 12:09
  • 1
    @Konrad Rudolph: why not? int *arr = new int[0]; compiles fine and requires delete [], hence it is an array. – bkxp Sep 11 '14 at 12:19
  • @John Zwinck: optimization is a tricky thing. Thanks for the hint. In a comment above this type of optimization was mentioned as a workaround for predicates. – bkxp Sep 11 '14 at 12:20
  • 1
    @JohnZwinck: An empty class might very well be POD, which means it can be memcopied. And since it's at least one (padding) byte big, the memory must be both readable and writeable. – MSalters Sep 11 '14 at 12:23
  • @MSalters: good point. But it doesn't have to actually *store* any padding data, right? Like if we made a system where reading these empty instances always gave 0x00 and writing their padding bytes was a no-op? Or do you think that it is a requirement of C++ that padding bytes can actually be used as storage and read back from? – John Zwinck Sep 11 '14 at 12:28
  • @bkxp That’s different: a nested array is an array of arrays. with your `arr`, all you could do is have an array of pointers, so the outer array wouldn’t take zero memory anyway, because its member type is a *pointer* (which takes memory `sizeof(int*)`). I was talking about a static array of arrays. – Konrad Rudolph Sep 11 '14 at 12:31
26

Basically, it's an interplay between two requirements:

  • Two different objects of the same type must be at a different addresses.
  • In arrays, there may not be any padding between objects.

Note that the first condition alone does not require non-zero size: Given

struct empty {};
struct foo { empty a, b; };

the the first requirement could easily be met by having a zero-size a followed by a single padding byte to enforce a different address, followed by a zero-size b. However, given

empty array[2];

that no longer works because a padding between the different objects empty[0] and empty[1] would not be allowed.

rhughes
  • 9,257
  • 11
  • 59
  • 87
celtschk
  • 19,311
  • 3
  • 39
  • 64
  • I'm still trying to figure out the cases when this would be absolutely necessary. One of such cases is avoiding division by zero when calculating N/sizeof(A). – bkxp Sep 11 '14 at 16:20
  • @bkxp: `std::sort( &array[0], &array[whatever its size is]);` – PlasmaHH Sep 12 '14 at 11:06
  • @PlasmaHH: Given that sorting an array of identical objects is a no-op, and with begin=end, any call of `std::sort` is a no-op as well, this specific line would certainly not break. – celtschk Sep 17 '14 at 13:56
  • @celtschk: I think everyones imagination should be good enough to think of anything similar that would break; `std::distance` would have been more to write... – PlasmaHH Sep 17 '14 at 14:23
14

All complete objects must have a unique address; so they must take up at least one byte of storage - the byte at their address.

A typical case where it can happen is a container class with a comparison predicate stored in a member variable.

In this case, you can use the empty base class optimisation: a base subobject is allowed to have the same address as the complete object that it's part of, so can take up no storage. So you can attach the predicate to a class as a (perhaps private) base class rather than a member. It's a bit more fiddly to deal with than a member, but should eliminate the overhead.

what type of overhead should be expected when calling A()(value) ?

The only overhead compared to calling a non-member function will be passing the extra this argument. If the function is inlined, then this should be eliminated (as would be the case, in general, when calling a member function that doesn't access any member variables).

Mike Seymour
  • 249,747
  • 28
  • 448
  • 644
  • 1
    It's worth noting that the overhead of passing `this` is generally small, but becomes larger if the number of arguments exceeds the ABI limitation for passing on the stack due to the unnecessary `this`. If the member function is `inline` however, this can be optimized out. – John Zwinck Sep 11 '14 at 12:08
  • Thanks for explaining. The inheritance trick is quite interesting, although it would require an adapter class to avoid infusion of an extraneous operator() in the container class. – bkxp Sep 11 '14 at 12:14
  • @bkxp: If you use private inheritance, then you _technically_ still have the `operator()` but it's private so nobody else can use it. I would expect virtually every container does this with the allocators. – Mooing Duck Sep 11 '14 at 20:10
1

There are already excellent answers that answer the main question. I would like to address the concerns you expressed with:

But the cost of this is quite severe when dealing with small-size containers. Considering possible data alignment, we can get up to 16 bytes per class without vars (?!). Suppose we have a custom container that will typically hold a few int values. Then consider an array of such containers (with about 1000000 members). The overhead will be 16*1000000! A typical case where it can happen is a container class with a comparison predicate stored in a member variable.

Avoiding the cost of holding A

If all instances of a container depend on type A, then there is no need to hold instances of A in the container. The overhead associated with the non-zero size of A can be avoided by simply creating an instance of A on the stack when needed.

Not being able to avoid the cost of holding A

You may be forced to hold a pointer to A in each instance of the container if A is expected to by polymorphic. For such a containerthe cost of each container goes up by the size of a pointer. Whether there are any member variables in the base class A or not makes no difference to the size of the container.

Impact of sizeof A

In either case, the size of an empty class should have no bearing on the storage requirements of the container.

R Sahu
  • 204,454
  • 14
  • 159
  • 270
  • Less memory requirement for containers vs less time creating temporary objects. I guess you'll have to decide which trade off is better for your use case. – R Sahu Sep 19 '14 at 06:47