15

Yesterday, my colleague and I weren't sure why the language forbids this conversion:

struct A { int x; };
struct B : virtual A { };

int A::*p = &A::x;
int B::*pb = p;

Not even a cast helps. Why does the Standard not support converting a pointer to member of a base class to a pointer to member of a derived class when the base class is a virtual base class?

Relevant C++ standard reference:

A prvalue of type “pointer to member of B of type cv T”, where B is a class type, can be converted to a prvalue of type “pointer to member of D of type cv T”, where D is a derived class (Clause 10) of B. If B is an inaccessible (Clause 11), ambiguous (10.2), or virtual (10.1) base class of D, or a base class of a virtual base class of D, a program that necessitates this conversion is ill-formed.

Both function and data member pointers are affected.
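For contrast, here is a minimal sketch of the cases the quoted rule does allow. The names `Base`, `Plain`, `Virt`, and the two helper functions are illustrative and not part of the original code. With a non-virtual base the implicit conversion compiles for both data and function member pointers; the virtual counterparts are left commented out because they are ill-formed.

```cpp
#include <cassert>

struct Base { int x = 0; int get() const { return x; } };
struct Plain : Base {};          // non-virtual base: conversion is allowed
struct Virt  : virtual Base {};  // virtual base: conversion is ill-formed

// Reads Base::x through a pointer to member converted to Plain::*.
inline int read_via_derived_ptr(Plain& d) {
    int Base::*pb = &Base::x;
    int Plain::*pd = pb;         // OK: implicit Base::* -> Plain::* conversion
    return d.*pd;
}

// Same conversion for a pointer to member function.
inline int call_via_derived_ptr(const Plain& d) {
    int (Base::*fb)() const = &Base::get;
    int (Plain::*fd)() const = fb;   // OK as well
    return (d.*fd)();
}

// The virtual counterparts do not compile:
//   int Base::*pb = &Base::x;
//   int Virt::*pv  = pb;                            // ill-formed
//   int Virt::*pv2 = static_cast<int Virt::*>(pb);  // ill-formed too
```

Uncommenting the last lines should produce a diagnostic on a conforming compiler, which is the behaviour the question asks about.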

DanielKO
Johannes Schaub - litb
  • 1
    Great question! Look forward to hearing the answer. – SkyVar Mar 13 '14 at 11:11
  • 3
    Following discussion with TemplateRex, could this question be simplified to "why can't I do `int B::*pb = &B::x;`?" It's not just that you can't convert `p`: you can't have a pointer-to-member to a member in a virtual base at all. – Steve Jessop Mar 13 '14 at 12:27
  • @steve my code is doing the same as yours. Just that it uses a temporary variable to add clarity. Your code is attempting to do the conversion as well. – Johannes Schaub - litb Mar 13 '14 at 18:33
  • @steve it is not correct that you can't have such a member pointer: "p" in my code is such a member pointer. – Johannes Schaub - litb Mar 13 '14 at 18:34
  • 1
    @JohannesSchaub-litb: `p` has type `int A::*`. The pointer itself doesn't need to "know" whether `A` is a virtual base of anything, since it is only ever dereferenced by code that can figure out the `A` base class sub-object address before applying it. In order to have a pointer of type `int B::*` that referred to `x`, the pointer value would have to indicate that the required member is in `A`. That is the distinction I mean to draw by saying that you cannot have a pointer-to-member to a member in a base class. – Steve Jessop Mar 14 '14 at 23:03
  • @JohannesSchaub-litb: but you're right, my example code does obscure my point because `&B::x;` has type `int A::*`. – Steve Jessop Mar 14 '14 at 23:12
  • @SteveJessop I considered wording the question like that, but am of the opinion that it wouldn't be as clear. Something like "Can a member pointer C::* exist whose value refers to a member of a virtual base class" would not encapsulate the original issue my colleague and I had (which was the conversion). Hence I stated it using the conversion. – Johannes Schaub - litb Mar 14 '14 at 23:22
  • 1
    @JohannesSchaub-litb: OK, if that's how you see it :-). It's just that after the discussion with TemplateRex I came to the conclusion that the answer to your question, "why can't I do this conversion?" is "because the thing you're trying to convert to doesn't exist". Which immediately raises the new question, "why doesn't it exist?"! – Steve Jessop Mar 14 '14 at 23:29
  • @SteveJessop well the answer to why it doesn't exist would be the answer to my question. Saying that it is because the type doesn't exist isn't a real answer. Alternatively, if I stated the question "why can't I have a value of this type?" you could have said "because there is no conversion from the value &A::x up to B::*" as an "answer"-in-disguise – Johannes Schaub - litb Mar 14 '14 at 23:39

2 Answers

8

Lippman's "Inside the C++ Object model" has a discussion about this:

[there] is the need to make the virtual base class location within each derived class object available at runtime. For example, in the following program fragment:

class X { public: int i; }; 
class A : public virtual X { public: int j; }; 
class B : public virtual X { public: double d; }; 
class C : public A, public B { public: int k; }; 
// cannot resolve location of pa->X::i at compile-time 
void foo( const A* pa ) { pa->i = 1024; } 

main() { 
 foo( new A ); 
 foo( new C ); 
 // ... 
} 

the compiler cannot fix the physical offset of X::i accessed through pa within foo(), since the actual type of pa can vary with each of foo()'s invocations. Rather, the compiler must transform the code doing the access so that the resolution of X::i can be delayed until runtime.

Essentially, the presence of a virtual base class invalidates bitwise copy semantics.
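A compilable variant of the quoted hierarchy can make the point concrete. This is a sketch, not Lippman's code: `foo` takes a non-const `A*` here (the quoted fragment writes through a `const A*`, which a conforming compiler rejects), the members get default initializers, and `offset_of_i` is a hypothetical helper added to observe where `X::i` sits relative to the `A` subobject.

```cpp
#include <cstddef>

class X { public: int i = 0; };
class A : public virtual X { public: int j = 0; };
class B : public virtual X { public: double d = 0; };
class C : public A, public B { public: int k = 0; };

// Writes through the A subobject; the location of X is resolved at run time.
inline void foo(A* pa) { pa->i = 1024; }

// Byte distance from the start of the A subobject to X::i (hypothetical helper).
inline std::ptrdiff_t offset_of_i(const A& a) {
    return reinterpret_cast<const char*>(&a.i) -
           reinterpret_cast<const char*>(&a);
}
```

Calling `foo` on an `A` and on a `C` updates the single shared `X::i` in both cases. On common ABIs `offset_of_i` reports different values for the two objects, although the exact numbers are implementation-specific.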

TemplateRex
  • But aren't class members ordered by inheritance in the memory? eg. `[X members][A or B members][C members]`. In such case trying to access X::sth (theoretically) should be deterministic regardless of what the object actually is. – Spook Mar 13 '14 at 11:29
  • @Spook the point is that the compiler needs flexibility because `A: virtual X` could be combined with `B : virtual X` into a `C : A, B`. Short of whole-program analysis, a fixed layout will run afoul very soon. Lippman has a quote: "Virtual base class support wanders off into the Byzantine". – TemplateRex Mar 13 '14 at 11:31
  • Why can't the compiler do the same "lets get the precise offset at runtime" with the member pointer? – Johannes Schaub - litb Mar 13 '14 at 11:32
  • @JohannesSchaub-litb I am not sure, perhaps the combinations of class taking `virtual` base classes, with `virtual` functions present in each are just too complicated to work out subobject-layouts reliably. At least compiler vendors thought so when the Standard was set into stone (the Lippman book is from 1996). – TemplateRex Mar 13 '14 at 11:36
  • @JohannesSchaub-litb btw, maybe `dynamic_cast` could be made to work with a Sufficiently Smart Compiler? It would not be free, because virtual base classes can add extra levels of indirection to trace the offsets. – TemplateRex Mar 13 '14 at 11:39
  • I agree with Johannes, it "feels like" some kind of cast should be able to do this. The information encapsulated in the value of `p` is, "offset 0". The information encapsulated in the type of `p` is "apply the offset from the address of an `A`". So there should be some means of taking those two facts and turning them into an `int B::*` that means, "the data member at offset 0 of the A base-class subobject, wherever that may be". Which I would think could then be dereferenced using the same virtual mechanism that dereferences `ptr->x` where `ptr` is a `B*`. – Steve Jessop Mar 13 '14 at 12:14
  • @SteveJessop I think the difficulty is that `p->mem1` and `p->mem2` can resolve to different offsets if `mem1` and `mem2` are inherited from different sub-objects. This means that, yes, you can resolve them at runtime, but, no, there is no cast to a single pointer *without knowing which member you want to access*. – TemplateRex Mar 13 '14 at 12:18
  • @TemplateRex: ah, gotcha. So the issue is that pointers-to-member are not required to be sufficiently complex to "record" which virtual base the member belongs to, whereas `ptr->x` knows that statically from the type of `ptr`. So you can't do the conversion for the same reason you can't do `int B::*pb = &B::a;`. – Steve Jessop Mar 13 '14 at 12:24
  • @SteveJessop I don't think pointers-to-member *could be* sufficiently complex to record whether they belong to the most-derived virtual base, since that is determined by their open set of subclasses. That is what's making the casting impossible. – TemplateRex Mar 13 '14 at 12:27
  • @TemplateRex: well, I suppose it would require some kind of type identifier under the covers. In principle I think the `int B::*` could contain some identifier to represent `A` (perhaps only in the context of `B`, not necessarily a global type id), and the vtable of `B` and each of its derived classes could contain a means of determining the `A` base class subobject pointer using that identifier. Which I'm prepared to accept is one step too far to be part of the language. – Steve Jessop Mar 13 '14 at 12:31
  • @SteveJessop but suppose a Sufficiently Advanced Compiler would do that for a given hierarchy. Then add one diamond level extra, but with the order of derivation reversed. The meaning of "most-derived" (which determines the offsets into the various sub-objects) would then be different and casting two levels down to the already cast pointer would behave inconsistently. – TemplateRex Mar 13 '14 at 12:36
2

Short answer:

I believe a compiler could make conversion from Base::* to Derived::* possible even when Derived derives virtually from Base. For this to work a pointer to member would need to record more than just the offset. It would also need to record the type of the original pointer through some type-erasure mechanism.

So my speculation is that the committee thought that this would be too much for a feature that is rarely used. In addition, something similar can be achieved with a pure library feature. (See the long answer.)

Long answer:

I hope my argument is not flawed in some corner case but here we go.

Essentially a pointer to member records the offset of the member with respect to the beginning of the class. Consider:

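#include <iostream>

struct A { int x; };
struct B : virtual A { int y; };
struct C : B { int z; };

// Prints the distance in bytes from the start of obj to obj.x.
void print_offset(const B& obj) {
  std::cout << (const char*) &obj.x - (const char*) &obj << '\n';
}

int main() {
  print_offset(B{});
  print_offset(C{});
}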

On my platform the output is 12 and 16. This shows that the offset of x with respect to obj's address depends on obj's dynamic type: 12 if the dynamic type is B and 16 if it's C.

Now consider the OP's example:

int A::*p = &A::x;
int B::*pb = p;

As we saw, for an object of static type B, the offset depends on its dynamic type. The two lines above use no object of type B, so there's no dynamic type to get the offset from.

However, to dereference a pointer to member an object is required. Couldn't a compiler take the object used at that time to get the correct offset? Or, in other words, could the offset computation be delayed until the time we evaluate obj.*pb (where obj is of static type B)?

It seems to me that this is possible. It's enough to cast obj to A& and use the offset recorded in pb (which it read from p) to get a reference to obj.x. For this to work pb must "remember" that it was initialized from an int A::*.
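The "convert to A& first, then apply the A-relative pointer" step can be sketched in a few lines. Note the swapped-in technique: this version uses `std::function` and a lambda for the "remembering", which is an assumption made for brevity and not what the Standard or any compiler does. The helper name `as_member_of_B` is hypothetical.

```cpp
#include <cassert>
#include <functional>

struct A { int x; };
struct B : virtual A { int y; };
struct C : B { int z; };

// Stand-in for the missing `int B::*`: a callable that remembers the
// original `int A::*` and locates the A subobject when it is invoked.
inline std::function<int&(B&)> as_member_of_B(int A::*p) {
    return [p](B& obj) -> int& {
        A& base = obj;   // derived-to-base conversion, resolved at run time
        return base.*p;  // apply the offset relative to A, not to B
    };
}
```

Invoking the result on a `B` or on a `C` (through its `B` base) reaches the same `x` that `obj.x` names, because the virtual base is located per object at call time.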

Here is a draft of a class template ptr_to_member that implements this strategy. The specialization ptr_to_member<T, U> is supposed to work similarly to T U::*. (Notice this is just a draft that can be improved in different ways.)

template <typename Member, typename Object>
class ptr_to_member {

  Member Object::* p_;
  Member& (ptr_to_member::*dereference_)(Object&) const;

  template <typename Base>
  Member& do_dereference(Object& obj) const {
      auto& base = static_cast<Base&>(obj);
      auto  p    = reinterpret_cast<Member Base::*>(p_);
      return base.*p;
  }

public:

  ptr_to_member(Member Object::*p) :
    p_(p),
    dereference_(&ptr_to_member::do_dereference<Object>) {
  }

  template <typename M, typename O>
  friend class ptr_to_member;

  template <typename Base>
  ptr_to_member(const ptr_to_member<Member, Base>& p) :
    p_(reinterpret_cast<Member Object::*>(p.p_)),
    dereference_(&ptr_to_member::do_dereference<Base>) {
  }

  // Unfortunately, we can't overload operator .* so we provide this method...
  Member& dereference(Object& obj) const {
    return (this->*dereference_)(obj);
  }

  // ...and this one
  const Member& dereference(const Object& obj) const {
    return dereference(const_cast<Object&>(obj));
  }
};

Here is how it should be used:

A a;
ptr_to_member<int, A> pa = &A::x; // int A::* pa = &A::x;
pa.dereference(a) = 42;           // a.*pa = 42;
assert(a.x == 42);

B b;
ptr_to_member<int, B> pb = pa;   // int B::* pb = pa;
pb.dereference(b) = 43;          // b.*pb = 43;
assert(b.x == 43);

C c;
ptr_to_member<int, B> pc = pa;   // int B::* pc = pa;
pc.dereference(c) = 44;          // c.*pc = 44;
assert(c.x == 44);

Unfortunately, ptr_to_member alone doesn't solve the issue raised by Steve Jessop:

Following discussion with TemplateRex, could this question be simplified to "why can't I do int B::*pb = &B::x;?" It's not just that you can't convert p: you can't have a pointer-to-member to a member in a virtual base at all.

The reason is that the expression &B::x is supposed to record only the offset of x from the beginning of B, which is unknown as we have seen. To make this work, after realising that B::x is actually a member of the virtual base A, the compiler would need to create something similar to ptr_to_member<int, B> from &A::x which "remembers" the A seen at construction time and records the offset of x from the beginning of A.

Community
Cassio Neri
  • Nice. Is the `reinterpret_cast` legitimate? I don't think it fundamentally matters if it isn't, since you could come up with another means of type erasure if needed. – Steve Jessop Mar 14 '14 at 23:26
  • @SteveJessop I think it is: In the constructor it casts from `Member Base::*` to `Member Object::*`. The result is stored in `p_`. `dereference_` (the only user of `p_`) is set to `do_dereference` which casts `p_` back to the original type. AFAIK this is OK by 5.2.10/9 second bullet point: "converting an rvalue of type “pointer to data member of X of type T1” to the type “pointer to data member of Y of type T2” (where the alignment requirements of T2 are no stricter than those of T1) and back to its original type yields the original pointer to member value." Here `T1 = T2 = Member`. – Cassio Neri Mar 15 '14 at 15:11
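The round-trip property quoted in the comment above can be checked in isolation. This is a sketch under assumptions: `round_trip_preserves` is a hypothetical helper, `A` gets a second member `w` only to have two pointers to compare, and the intermediate `int B::*` value is never dereferenced. Some compilers may reject or warn about casts between member-pointer types of differing representation, so treat it as illustrative.

```cpp
struct A { int x; int w; };
struct B : virtual A { int y; };

// Casts an `int A::*` to `int B::*` and back, then compares with the original.
inline bool round_trip_preserves(int A::*p) {
    auto erased   = reinterpret_cast<int B::*>(p);       // storage only
    auto restored = reinterpret_cast<int A::*>(erased);  // back to the original type
    return restored == p;
}
```

This mirrors what `ptr_to_member` does between its converting constructor and `do_dereference`: the value is stored under a different member-pointer type and converted back before use.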