11

Consider the following declarations of a pair of related structs. The descendant class adds no member variables, and the only member function is a constructor that does nothing but forward all its arguments to the base class's constructor.

struct Base {
  Base(int a, char const* b):
    a(a), b(b)
  { }
  int a;
  char const* b;
};

struct Descendant: Base {
  Descendant(int a, char const* b):
    Base(a, b)
  { }
};

Now consider the following code using those types. Function foo expects to receive an array of Base. However, main defines an array of Descendant and passes that to foo instead.

void foo(Base* array, unsigned len)
{
  /* <access contents of array> */
}

int main()
{
  Descendant main_array[] = {
    Descendant(0, "zero"),
    Descendant(1, "one")
  };
  foo(main_array, 2);
  return 0;
}

Is the behavior of this program defined? Does the answer depend on the body of foo, such as whether it writes to the array or only reads from it?

If sizeof(Descendant) is unequal to sizeof(Base), then behavior is undefined according to the answers to a previous question about a base pointer to an array of derived objects. Is there any chance the objects in this question will have differing sizes, though?

Rob Kennedy
  • For the sake of whomever you trust, this question was on my test yesterday. I was completely lost then, and I'm still completely lost now. (The question concerned assigning an array of a derived class to an array of its base class in Java. It was said that the assignment would break the soundness of later operations. The assignment: `class Mammal ...; class Dog extends Mammal ...; Dog x[] = new Dog[5]; Mammal y[] = x;`). – Rubens Nov 07 '13 at 18:46
  • @Rubens Java is a very different matter from C++ though. For starters, concerns about the size of the elements (as in the last paragraph of this question) don't apply because all classes are reference types. The soundness problem is visible in code like `y[0] = new Cat; Dog fido = x[0];`, which would result in the dog `fido` actually being a `Cat`. (This does not actually work; the `y[0] = new Cat` part throws an exception at run time.) *This* problem does apply to C++, but let's assume we're not doing that. –  Nov 07 '13 at 18:52
  • I think [this](http://stackoverflow.com/questions/17845128/struct-alignment-c-c) question gets us some of the way in showing this is OK. – user786653 Nov 07 '13 at 19:34
  • Is it safe? No. Is it likely to work anyway? Yes. – Mark Ransom Nov 07 '13 at 19:51
  • Care to expand on that, @Mark? What, specifically, makes it unsafe? – Rob Kennedy Nov 07 '13 at 19:53
  • Sorry, if I was able to expand on it I would have left a complete answer. Generally anything that isn't guaranteed by the standard can be considered unsafe. I don't know enough about the standard to state authoritatively that it's not guaranteed, but since it's trivial to construct cases where it's going to blow up badly I can't see them making an exception for a corner case. I'm happy to be proven wrong though. – Mark Ransom Nov 07 '13 at 20:16
  • @MarkRansom: What's an example of where this will blow up? I can't seem to find any definitive answers/citations myself, but I think the example presented in the OP is well-defined (and `static_assert(sizeof(Descendant)==sizeof(Base),"Never fires")`). – user786653 Nov 07 '13 at 21:29
  • @user786653, the "corner case" I was referring to is when there are no added members in the derived class (making the assertion true). I'm not even sure the standard guarantees the derived class will be the same size as the base, although I can't think of an example where it wouldn't. – Mark Ransom Nov 07 '13 at 22:25

5 Answers

1

Is the behavior of this program defined? Does the answer depend on the body of foo, such as whether it writes to the array or only reads from it?

I'm gonna hazard an answer saying that the program is well defined (as long as foo is) even if it is written in another language (e.g. C).

If sizeof(Descendant) is unequal to sizeof(Base), then behavior is undefined according to the answers to a previous question about a base pointer to an array of derived objects. Is there any chance the objects in this question will have differing sizes, though?

I don't think so. This is my reading of the standard(*), §9.2 clause 17:

Two standard-layout struct (Clause 9) types are layout-compatible if they have the same number of non-static data members and corresponding non-static data members (in declaration order) have layout-compatible types (3.9).

§9 clauses 7 through 9 detail what makes a class standard-layout, which layout-compatibility depends on:

7 A standard-layout class is a class that:

  • has no non-static data members of type non-standard-layout class (or array of such types) or reference,

  • has no virtual functions (10.3) and no virtual base classes (10.1),

  • has the same access control (Clause 11) for all non-static data members,

  • has no non-standard-layout base classes,

  • either has no non-static data members in the most derived class and at most one base class with non-static data members, or has no base classes with non-static data members, and

  • has no base classes of the same type as the first non-static data member.

8 A standard-layout struct is a standard-layout class defined with the class-key struct or the class-key class. A standard-layout union is a standard-layout class defined with the class-key union.

9 [ Note: Standard-layout classes are useful for communicating with code written in other programming languages. Their layout is specified in 9.2. — end note ]

Note especially the last clause (combined with §3.9) - according to my reading this is guaranteeing that as long as you're not adding too much "C++ stuff" (virtual functions etc. and thus violating the standard-layout requirement) your structs/classes will behave as C structs with added syntactical sugar.
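The standard-layout property described in the quoted clauses can be checked mechanically with the C++11 type trait std::is_standard_layout. The following is a minimal sketch, not part of the original answer; it repeats the question's structs so that it is self-contained, and the helper name both_standard_layout is made up for illustration.

```cpp
#include <type_traits>

struct Base {
  Base(int a, char const* b): a(a), b(b) { }
  int a;
  char const* b;
};

struct Descendant: Base {
  Descendant(int a, char const* b): Base(a, b) { }
};

// Both types meet the quoted rules: no virtual functions, a single
// access level, and all non-static data members declared in one class.
static_assert(std::is_standard_layout<Base>::value,
              "Base is standard-layout");
static_assert(std::is_standard_layout<Descendant>::value,
              "Descendant is standard-layout");

// Hypothetical helper (an assumption for illustration) that exposes
// the same check as a run-time value.
inline bool both_standard_layout() {
  return std::is_standard_layout<Base>::value &&
         std::is_standard_layout<Descendant>::value;
}
```

This confirms only that both types are standard-layout. By itself it says nothing about whether sizeof(Descendant) equals sizeof(Base).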

Would anyone have doubted the legality if Base didn't have a constructor? I don't think so as that pattern (deriving from a C structure adding a constructor/helper functions) is idiomatic.

I'm open to the possibility that I'm wrong and welcome additions/corrections.

(*) I'm actually looking at N3290 here, but the actual standard should be close enough.

dyp
user786653
  • Thank you. Your reasoning appears sound. It hadn't occurred to me when I asked that the specific contents of `Base` might affect things, but it sounds like giving it virtual methods, or non-public visibility sections, or a `std::string` instead of `char*`, could all invalidate this code, even though from a practical standpoint, I don't think we should expect any of those things to *actually* cause the layouts of the two classes to differ. – Rob Kennedy Nov 09 '13 at 05:34
  • AFAIK, the layout of standard-layout classes is *not* exactly specified, namely the padding between the non-static data members. See the last section in 9.2 – dyp Nov 12 '13 at 18:09
0

BEWARE! While this is almost certainly true in your compiler, this is not guaranteed by the standard to work.

At the very least, add a check such as if (sizeof(Descendant) != sizeof(Base)) logAndAbort("size mismatch between Descendant and Base");

In case you were wondering, the compilers for which this is safe are exactly those on which the size doesn't change. Something left behind in the standard allows a derived class's storage to be non-contiguous with its base class. In every case where that happens, the size must grow (for obvious reasons).
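The size check suggested in this answer can also be made at compile time, so that a mismatch stops the build instead of aborting at run time. This is a hedged sketch assuming a C++11 compiler; the structs are repeated from the question, and sizes_match is a hypothetical helper name.

```cpp
struct Base {
  Base(int a, char const* b): a(a), b(b) { }
  int a;
  char const* b;
};

struct Descendant: Base {
  Descendant(int a, char const* b): Base(a, b) { }
};

// Refuse to compile on any implementation that gives the two types
// different sizes.
static_assert(sizeof(Descendant) == sizeof(Base),
              "size mismatch between Descendant and Base");

// Hypothetical helper (an assumption for illustration) returning the
// same comparison as a value.
constexpr bool sizes_match() {
  return sizeof(Descendant) == sizeof(Base);
}
```

A passing assertion only rules out a size mismatch on the compiler at hand. It does not by itself settle whether the standard defines the behavior of indexing the array through a Base*.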

Joshua
0

If you declare an array of pointers to Base, then the code will run correctly. As a bonus, the new foo() will be safe to use with some future subclass of Base that has new data members.

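#include <iostream>

void foo(Base **array, unsigned len)
{
    // Example code: each element is a pointer, so the stride is
    // sizeof(Base*) whatever the size of the pointed-to objects.
    for(unsigned i = 0; i < len; ++i)
    {
        Base *x = array[i];
        std::cout << x->a << x->b;
    }
}

void do_something()
{
    // Automatic objects avoid deleting a Descendant through a Base*,
    // which is undefined because Base has no virtual destructor.
    Base first(1, "a");
    Descendant second(2, "b");
    Base *data[2] = { &first, &second };

    foo(data, 2);
}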
Ken A
  • Welcome to Stack Overflow. Although you have made true statements, you haven't made any statements answering the question I asked. Thank you for taking the time, though. – Rob Kennedy Nov 07 '13 at 22:28
  • I was paraphrasing from here. http://www.parashift.com/c++-faq/array-derived-vs-base.html – Ken A Nov 09 '13 at 13:14
0

Adding new data to the Descendant class will break Descendant[]'s interchangeability with Base[]. In order for some function to pretend an array of larger structures is an array of smaller but otherwise compatible structures, a new array would have to be prepared in which the extra bytes are sliced off, in which case it is impossible to define the behavior of the system. What happens if some pointers are sliced off? What happens if the state of these objects is supposed to change as part of the called procedure, and the actual objects to which they refer are not the originals?

Otherwise, if no slicing occurs and a Base* into the Descendant[] is incremented, sizeof(Base) is added to its address, and it no longer points to a Base. There is obviously no way to define the behavior of the system in that case either.
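The stride mismatch described here can be made concrete without performing the invalid pointer arithmetic itself. The sketch below assumes a hypothetical subclass Wider that, unlike the question's Descendant, adds a data member; Wider, wider_stride, and base_stride are illustrative names, not part of the question.

```cpp
#include <cstddef>

struct Base {
  Base(int a, char const* b): a(a), b(b) { }
  int a;
  char const* b;
};

// Hypothetical subclass that adds a member.
struct Wider: Base {
  Wider(int a, char const* b, int extra): Base(a, b), extra(extra) { }
  int extra;
};

// Byte distance between consecutive elements of a real Wider array.
inline std::ptrdiff_t wider_stride() {
  Wider arr[2] = { Wider(0, "zero", 10), Wider(1, "one", 11) };
  return reinterpret_cast<char*>(&arr[1]) -
         reinterpret_cast<char*>(&arr[0]);
}

// Byte distance a Base* moves when it is incremented.
inline std::ptrdiff_t base_stride() {
  return static_cast<std::ptrdiff_t>(sizeof(Base));
}
```

On mainstream compilers wider_stride() is larger than base_stride(), so a Base* stepped once from the first element would land inside that element, not at the start of the second one.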

Knowing that, using this idiom is NOT safe, even if the standard and the president and God define it as working. Any addition to Descendant breaks your code. Even if you add an assertion, there will be functions whose legitimacy depends on Base[] being interchangeable with Descendant[]. Whoever maintains your code will have to hunt down each of these cases and come up with an appropriate workaround. Factoring your program around this idiom to avoid these problems will probably not be worth the convenience.

sqykly
  • I know adding members to `Descendant` will break the code, but that's not what I asked. As I said, that scenario is already covered in the other question I linked to. If you're saying that a compiler could decide to make the classes' sizes unequal *as they're defined in my question*, then please say so, and explain how you know that. Otherwise, I'm not sure how what you've written is meant to answer what I asked. Please clarify. – Rob Kennedy Nov 09 '13 at 05:14
  • @RobKennedy You asked if passing a `Descendant[]` with no non-static members or virtual methods etc etc to a function expecting `Base[]` is *safe*. Sizes and a compiler's liberties with the binary representation of your class have nothing to do with safe. It's a maintainability hazard either way. That's what I meant to get across in paragraph 3. Shall I rephrase it, or do you still consider this uninformative? – sqykly Nov 09 '13 at 07:12
-1

While this particular example is safe on all modern platforms and compilers I am familiar with, it is not safe in general, and it is an example of bad code.

  1. It cannot work on compilers/platforms where sizeof(Base) != sizeof(Descendant)
  2. It is unsafe because someday someone in your project will add a new non-static member to the Descendant class or will make the Base class virtual.

Update: Both Base and Descendant are standard-layout types. The standard therefore requires that a pointer to Descendant can be correctly reinterpret_cast to a pointer to Base, which means no padding is allowed at the front of the structure. But the C++ standard places no requirement on padding at the end of a structure, so that part is compiler-dependent. There is also a proposal to explicitly mark this behavior as undefined: http://www.open-std.org/jtc1/sc22/wg21/docs/cwg_defects.html#1504
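The "no padding at the front" half of this argument can be observed directly. This is a minimal sketch, repeating the question's structs, in which the hypothetical helper base_at_offset_zero compares the address of a Descendant object with the address of its Base subobject.

```cpp
struct Base {
  Base(int a, char const* b): a(a), b(b) { }
  int a;
  char const* b;
};

struct Descendant: Base {
  Descendant(int a, char const* b): Base(a, b) { }
};

// Hypothetical helper (an assumption for illustration): true when the
// Base subobject starts at the same address as the whole Descendant.
inline bool base_at_offset_zero() {
  Descendant d(1, "one");
  Base* as_base = &d;  // implicit derived-to-base conversion
  return static_cast<void*>(as_base) == static_cast<void*>(&d);
}
```

This covers only the start of the object. Whether Descendant carries extra padding at its end, which is the open point in the update, is not something this check can show.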

Alan Milton
  • Do you have any evidence more authoritative than your personal experience? Under what circumstances would the sizes of the two classes be unequal? What in the standard allows for that? The answer from User786653 appears to argue that it's not possible. – Rob Kennedy Nov 08 '13 at 23:24
  • @RobKennedy User786653 offers too liberal an interpretation of §9.2 clause 17 "same number of non-static data members". In §1.8, "data member", "complete object" and "base class subobject" are used as separate terms. §9.2 defines how data members are declared, and I could not find any case in the standard that would treat data members of the base class as data members of the derived class. The newer wording of clause 17 is different and uses the term "common initial sequence". Neither its definition nor any of the examples imply that it applies to inheritance. – Alan Milton Jun 30 '17 at 16:42