22

I know that C++ doesn't support covariance for container elements the way Java or C# do. So the following code is probably undefined behavior:

#include <vector>
struct A {};
struct B : A {};
std::vector<B*> test;
std::vector<A*>* foo = reinterpret_cast<std::vector<A*>*>(&test);

Not surprisingly, I received downvotes when suggesting this as a solution to another question.

But what part of the C++ standard exactly tells me that this will result in undefined behavior? It's guaranteed that both std::vector<A*> and std::vector<B*> store their pointers in a contiguous block of memory. It's also guaranteed that sizeof(A*) == sizeof(B*). Finally, A* a = new B is perfectly legal.

So what bad spirits in the standard did I conjure (except style)?

Daniel Gehriger
  • 1
    Once you use reinterpret_cast<>(), nothing after that point is defined. It may work, but your list of conditions is awfully short. I would add a couple more preconditions: sizeof(A) == sizeof(B); neither A nor B may contain any kind of virtual function; and neither A nor B nor any descendant placed in the array can use multiple inheritance. – Martin York Jan 26 '11 at 17:28
  • 7
    The non-C++-specific answer is that it's not type-safe. If you add an `A` to foo, you leave test in an invalid state, since it guarantees that all elements are of type `B`. And C# doesn't support this either. C# only supports it for generic parameters that are used in safe ways (either input only or output only), and only on interfaces and delegates. Java supports it because it adds runtime checks and internally works on the object base class. – CodesInChaos Jan 26 '11 at 17:30
  • This question looks similar to http://stackoverflow.com/questions/842387/how-do-i-dynamically-cast-between-vectors-of-pointers – Nekuromento Jan 26 '11 at 17:36
  • Just a followup question without trying to hijack someone else's: would a `std::copy` provide the behavior the OP is after? – rubenvb Jan 26 '11 at 17:39
  • @rubenvb: You would need `std::transform` to convert a `vector<Base*>` to a `vector<Derived*>` (since there is no implicit `Base -> Derived` conversion, for obvious reasons), and that then requires a copy be made of the `vector`, which the OP was trying to avoid. – James McNellis Jan 26 '11 at 17:43
  • @Code: [related answer](http://stackoverflow.com/questions/4229886/why-a-b-doesnt-make-lista-listb-wouldnt-that-remove-need-for-wildcards/4233050#4233050) to the type-safety problem. – fredoverflow Jan 26 '11 at 17:50
  • As @CodeInChaos says, it is not a C++ problem, but a problem with your code. Neither Java nor C# allows that code, for precisely the same reason. Conceptually the code is wrong, even if you can make it work by being very careful with what you do. – David Rodríguez - dribeas Jan 26 '11 at 18:18
  • @David are you sure Java doesn't allow it? I'm not a Java developer, but from what I read about how Java generics work it should be possible there. – CodesInChaos Jan 26 '11 at 18:38
  • 1
    @CodeInChaos: What made you think so? The whole idea is flawed and would break the type system. A simple test is present [here](http://ideone.com/37GaS). Generics allow for covariance and contravariance in function arguments, depending on what you want to do, like in `void append( Vector<? super Derived> v ) { v.add( new Derived() ); }` or `void extract( Vector<? extends Base> v ) { Base b = v.get(0); }`, but they won't allow the conversion of the references. – David Rodríguez - dribeas Jan 26 '11 at 19:15
  • 1
    The reason that is allowed in functions is that generics are just a compile-time type check and the generic types are *erased* from the binary. When you use the covariant/contravariant arguments in functions, the compiler can check that the operations inside the function don't break the requirements of the interface. On the calling side it can check the same requirement and then pass the reference, which will always be to a non-generic `Vector` (contains `Object`). On the other hand, if the conversion were allowed, you would be able to add `Base` objects to a container of `Derived`. – David Rodríguez - dribeas Jan 26 '11 at 19:18
  • I think the recommended way to do this is to write a `dynamic_vector_cast` where you make sure (at compile time) that the pointers (the container's value type) can be dynamic-cast from one to the other. – alfC Feb 02 '15 at 01:26

4 Answers

19

The rule violated here is documented in C++03 3.10/15 [basic.lval], which specifies what is referred to informally as the "strict aliasing rule":

If a program attempts to access the stored value of an object through an lvalue of other than one of the following types the behavior is undefined:

  • the dynamic type of the object,

  • a cv-qualified version of the dynamic type of the object,

  • a type that is the signed or unsigned type corresponding to the dynamic type of the object,

  • a type that is the signed or unsigned type corresponding to a cv-qualified version of the dynamic type of the object,

  • an aggregate or union type that includes one of the aforementioned types among its members (including, recursively, a member of a subaggregate or contained union),

  • a type that is a (possibly cv-qualified) base class type of the dynamic type of the object,

  • a char or unsigned char type.

In short, given an object, you are only allowed to access that object via an expression that has one of the types in the list. For a class-type object that has no base classes, like std::vector<T>, basically you are limited to the types named in the first, second, and last bullets.
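To make the list concrete, here is a minimal sketch of two accesses it permits: one through a base class type and one through unsigned char. The names Base, Derived, read_through_base and count_nonzero_bytes are made up for illustration and do not come from the standard or the question.

```cpp
#include <cstddef>

struct Base { int x; };
struct Derived : Base { int y; };

// Permitted: the lvalue has a base class type of the object's dynamic type.
int read_through_base(Derived& d) {
    Base& b = d;  // implicit derived-to-base conversion, no cast needed
    return b.x;
}

// Permitted: any object may be examined as a sequence of unsigned char.
std::size_t count_nonzero_bytes(const int& value) {
    const unsigned char* bytes = reinterpret_cast<const unsigned char*>(&value);
    std::size_t count = 0;
    for (std::size_t i = 0; i < sizeof value; ++i) {
        if (bytes[i] != 0) ++count;
    }
    return count;
}
```

Note that the reinterpret_cast to unsigned char is fine here only because of the last bullet; the same cast to an unrelated class type would not be covered by any entry in the list.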

std::vector<Base*> and std::vector<Derived*> are entirely unrelated types and you can't use an object of type std::vector<Base*> as if it were a std::vector<Derived*>. The compiler could do all sorts of things if you violate this rule, including:

  • perform different optimizations on one than on the other, or

  • lay out the internal members of one differently, or

  • perform optimizations assuming that a std::vector<Base*>* can never refer to the same object as a std::vector<Derived*>*, or

  • use runtime checks to ensure that you aren't violating the strict aliasing rule

[It might also do none of these things and it might "work," but there's no guarantee that it will "work" and if you change compilers or compiler versions or compilation settings, it might all stop "working." I use the scare-quotes for a reason here. :-)]

Even if you just had a Base*[N] you could not use that array as if it were a Derived*[N] (though in that case, the use would probably be safer, where "safer" means "still undefined but less likely to get you into trouble").
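If a std::vector<A*> is really needed, one well-defined alternative is to copy the pointers and let the implicit B* to A* conversion apply to each element. A minimal sketch using the question's A and B; the helper name to_base_vector is hypothetical, and it does make the copy the asker was hoping to avoid:

```cpp
#include <vector>

struct A {};
struct B : A {};

// Builds a new vector<A*>; each B* is converted implicitly to A*.
std::vector<A*> to_base_vector(const std::vector<B*>& source) {
    return std::vector<A*>(source.begin(), source.end());
}
```

The two vectors are then independent objects of their own types, so pushing an A* into the result cannot corrupt the original vector of B*.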

James McNellis
  • @James: thanks for coming over ;-) Wow, your last comment comes as a real surprise to me. I don't consider myself a C++ novice after 15 years of working with it, but I would never have thought that this (invalid behavior) even applies in the case of arrays! Thanks. – Daniel Gehriger Jan 26 '11 at 17:32
  • It (the `Base[N]` vs `Derived[N]`) will definitely get you into trouble if the sizes are different, e.g. if the derived class has more members. – etarion Jan 26 '11 at 17:33
  • @etarion: we are talking pointers here: eg, `f(Base a[]) { ... }` and passing it (through `reinterpret_cast`) an array of `Derived`. – Daniel Gehriger Jan 26 '11 at 17:35
  • @etarion: Oops. I meant `Base*[N]` vs. `Derived*[N]`, to match the OP's usage of `std::vector<Base*>` and `std::vector<Derived*>`. – James McNellis Jan 26 '11 at 17:36
  • `f(Base a[])` has the same issue. `f(Base* a[])` is a different case, see James' edit. – etarion Jan 26 '11 at 17:38
4

You are invoking the bad spirit of reinterpret_cast<>.

Unless you really know what you are doing (and I mean that neither boastfully nor pedantically), reinterpret_cast is one of the gates of evil.

The only safe use I know of is passing classes and structures between C++ and C function calls. There may be some others, however.

Stephane Rolland
  • 1
    Another reasonable use is with "fast math" approximations leveraging the representation of floating point numbers. The risk is essentially eliminated by the requirements of the approximations, e.g. the [fast inv sqrt](https://en.wikipedia.org/wiki/Fast_inverse_square_root) which (in-)famously works exclusively on `32-bit floating-point number[s] in IEEE 754 floating-point format`. TL;DR steer clear unless you're a professional "[nasal demon](https://en.wikipedia.org/wiki/Undefined_behavior)" wrangler. – John P Dec 26 '17 at 17:59
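For the bit-level tricks mentioned in the comment above, a way to read a float's representation without going through a reinterpret_cast'ed pointer is to copy the bytes with std::memcpy. This is only a sketch: the function name float_bits is made up, and it assumes a 32-bit float (checked by the static_assert).

```cpp
#include <cstdint>
#include <cstring>

// Copies the object representation of a float into a 32-bit integer.
// Assumption: float is 32 bits wide on the target platform.
std::uint32_t float_bits(float value) {
    static_assert(sizeof(float) == sizeof(std::uint32_t),
                  "this sketch assumes a 32-bit float");
    std::uint32_t bits = 0;
    std::memcpy(&bits, &value, sizeof bits);
    return bits;
}
```

What the returned bits mean still depends on the floating-point format in use; the copy itself does not run into the aliasing rule, because no float object is accessed through an integer lvalue.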
4

The general problem with covariance in containers is the following:

Let's say your cast would work and be legal (it isn't but let's assume it is for the following example):

#include <vector>
struct A {};
struct B : A { int Method(int x, int z); };
struct C : A { bool Method(char y); };

void demo() {
    std::vector<B*> test;
    // Pretend this cast is legal (it is not; this is undefined behavior).
    std::vector<A*>* foo = reinterpret_cast<std::vector<A*>*>(&test);
    foo->push_back(new C);  // a C* now sits inside a vector<B*>
    test[0]->Method(7, 99); // What should happen here???
}

So you have also, in effect, reinterpret_cast a C* to a B*...

Actually I don't know how .NET and Java manage this (I think they throw an exception when trying to insert a C).
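In C++, the read-only half of covariance can be had without any cast by taking the derived vector by const reference and converting each element as it is handed out. A hedged sketch: for_each_as and sum_ids are hypothetical helpers, and A is given an id member here purely so there is something to read.

```cpp
#include <type_traits>
#include <vector>

struct A { int id; };
struct B : A {};

// Calls fn(Base*) for every element; nothing can be inserted through this view.
template <typename Base, typename Derived, typename Fn>
void for_each_as(const std::vector<Derived*>& items, Fn fn) {
    static_assert(std::is_base_of<Base, Derived>::value,
                  "Derived must derive from Base");
    for (Derived* item : items) {
        Base* as_base = item;  // implicit conversion, checked at compile time
        fn(as_base);
    }
}

// Example use: sums A::id over a vector<B*>.
int sum_ids(const std::vector<B*>& items) {
    int total = 0;
    for_each_as<A>(items, [&total](A* a) { total += a->id; });
    return total;
}
```

Because the caller never gets a std::vector<A*> to write into, the push_back(new C) problem from the example above cannot arise.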

mmmmmmmm
  • Good point. Although I was aware of the fact that foo mustn't be modified. I should have declared it as `const`. – Daniel Gehriger Jan 26 '11 at 20:31
  • Java and C# prevent it. That's what the `List<? extends Base>` is about: you can't call `x.add(someBaseObject)` when `x` has that type. – Norswap Jan 27 '16 at 16:28
1

I think it'll be easier to show than tell:

#include <cassert>

struct A { int a; };

struct Stranger { int a; };

struct B: Stranger, A {};

int main(int argc, char* argv[])
{
  B someObject;
  B* b = &someObject;

  A* correct = b;
  A* incorrect = reinterpret_cast<A*>(b);

  assert(correct != incorrect); // troubling, isn't it ?

  return 0;
}

The (specific) issue shown here is that when doing a "proper" conversion, the compiler adds some pointer adjustment depending on the memory layout of the objects. With a reinterpret_cast, no adjustment is performed.
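The adjustment can be observed and undone with static_cast, which knows the layout. A small sketch along the lines of the code above; the member names a_value and s_value and the helper functions are made up here, and the actual offset is implementation-specific:

```cpp
#include <cstddef>

struct A { int a_value; };
struct Stranger { int s_value; };
struct B : Stranger, A {};

// static_cast applies whatever adjustment the layout requires, in both directions.
A* to_base(B* b) { return static_cast<A*>(b); }
B* to_derived(A* a) { return static_cast<B*>(a); }

// Byte distance from the start of the B object to its A subobject.
// Often sizeof(Stranger) in practice, but the standard does not promise that.
std::ptrdiff_t base_offset(B* b) {
    const char* whole = reinterpret_cast<const char*>(b);
    const char* part = reinterpret_cast<const char*>(to_base(b));
    return part - whole;
}
```

Round-tripping through to_base and to_derived yields the original B* again, whereas a reinterpret_cast in either direction would skip the offset entirely.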

I suppose you'll understand why the use of reinterpret_cast should normally be banned from the code...

Matthieu M.
  • Yes, in the case of multiple inheritance, this adds to the trouble. This isn't an issue with single inheritance, though. – Daniel Gehriger Jan 26 '11 at 20:33
  • 1
    @Daniel: except if you use `virtual` inheritance... or if your base class does not have virtual methods and the derived class does (on most implementations); it is not guaranteed by the standard, so it's a bug in waiting. – Matthieu M. Jan 27 '11 at 07:09