11

Consider this minimal example (the smallest I could think of):

struct Bar;

struct Foo {
  Bar* const b;
  Foo(Bar* b) : b(b) {}
};

struct Bar {
  Foo* const f;
  Bar(Foo* f) : f(f) {}
};

struct Baz : Bar {
  Baz() : Bar(new Foo(this)) {}
};

When passing this to the ctor of Foo, nothing in the hierarchy of Baz has been created, but neither Foo nor Bar do anything problematic with the pointers they receive.

Now the question is: is it merely dangerous to hand out this in this fashion, or is it undefined behaviour?

Question 2: What if Foo::Foo(Bar*) was a Foo::Foo(Bar&) with the same semantics? I would have to pass *this, but the deref operator wouldn't do anything in this case.

bitmask
  • +1 Good question, you obviously thought about this before posting. – wilhelmtell Nov 14 '11 at 21:07
  • Right now I will not hold it against you that you browse the web with Opera. But I do reserve the right to do so in the future should my lawyers feel it necessary. – wilhelmtell Nov 14 '11 at 21:09

4 Answers

8

It's not UB. The object might not be initialised properly yet (so using it right away might not be possible), but storing the pointer for later is fine.
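
A minimal sketch of the "store now, use later" pattern, reusing the question's types. The destructor, the deleted copy constructor and the helper `stored_pointer_is_usable` are hypothetical additions for illustration only; whether the implicit `Baz*` to `Bar*` conversion inside the initializer list is itself sanctioned by the standard is debated in the other answers.

```cpp
#include <cassert>

struct Bar;

struct Foo {
    Bar* const b;
    Foo(Bar* b) : b(b) {}            // only stores the pointer, never reads through it
};

struct Bar {
    Foo* const f;
    Bar(Foo* f) : f(f) {}
};

struct Baz : Bar {
    Baz() : Bar(new Foo(this)) {}    // `this` is handed out before Bar is constructed
    Baz(const Baz&) = delete;        // assumption: avoid a double delete through copies
    ~Baz() { delete f; }             // assumption: Baz owns the Foo it created
};

// Hypothetical helper: checks the stored pointers once construction is complete.
bool stored_pointer_is_usable() {
    Baz baz;
    // After the constructor has finished, the stored pointer refers to the
    // Bar subobject of a fully constructed Baz and can be used normally.
    assert(baz.f->b == static_cast<Bar*>(&baz));
    return baz.f->b->f == baz.f;
}
```

The point of the sketch is that nothing dereferences the pointer until `Baz` has been fully constructed.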

I would have to pass *this, but the deref operator wouldn't do anything in this case.

Of course it would: it would dereference the pointer. Remember that initialisation is not the same as allocation. When the constructor runs, the object's storage is already allocated (otherwise you wouldn't be able to initialise it), i.e. the object exists, but it is in an indeterminate state until its constructor is done.

Cat Plus Plus
  • 1
    In `T* tp = ...; T& tr = *tp;` the `*` operator doesn't do anything, right? In the sense that it is not reflected in the generated code. – bitmask Nov 14 '11 at 19:13
  • 2
    @bitmask It **dereferences** the pointer. You can hardly call that nothing. On any decent compiler, generated code is going to be wildly different from whatever you wrote, so it's something you should probably be ignoring. – R. Martinho Fernandes Nov 14 '11 at 19:19
  • @R. Martinho Fernandes: Wait a second: `T&` is just a pointer to `T` with syntactic sugar, right? What am I missing? – bitmask Nov 14 '11 at 19:22
  • 1
    `T&` is a reference to `T`, `T*` is a pointer to `T`. They're not at all the same (no, really, if they were, why the heck would we want them both?). A reference is not a pointer. Major differences include the fact that a reference cannot be reseated and cannot be invalid. – R. Martinho Fernandes Nov 14 '11 at 19:23
  • @bitmask: No, `T&` is a reference. – Cat Plus Plus Nov 14 '11 at 19:23
  • I'm talking about a logical perspective. As far as I know there is no compiler that does not implement references with pointers. So for all that matters, a reference is a const-pointer. Of course, references are *syntactically* different from pointers but that is only syntactic sugar. So, all I'm saying is that the `*` in my above example is a mere ... "type conversion". – bitmask Nov 14 '11 at 19:33
  • 1
    @bitmask in some situations compilers can implement references as *nothing at all* (think about inlining and local variables). But anyway, a reference is not like a const pointer. A reference is never null, something you can't say of a const pointer. A reference has **semantic value**, and that makes it much more than *syntactic* sugar. And I still don't see how a "type conversion" is *nothing*. – R. Martinho Fernandes Nov 14 '11 at 19:36
  • A reference is an alias, in this case to `*tp`; the standard doesn't prescribe the mechanism, or even that the alias actually exists in generated code. It's also not a UB to dereference an uninitialized pointer if it's not used or in unevaluated context. Thus `T& tr = *tp;` doesn't introduce UB, until `tp` is used. – Gene Bushuyev Nov 14 '11 at 19:43
  • @bitmask: It doesn't matter how compilers implement it. References have different semantics than pointers. Especially when you consider rvalue references, which have no direct counterpart in pointer world. – Cat Plus Plus Nov 14 '11 at 19:43
  • @GeneBushuyev: That's actually a point of some contention; see the notes [here](http://stackoverflow.com/questions/2474018/when-does-invoking-a-member-function-on-a-null-instance-result-in-undefined-beha/2474021#2474021). – GManNickG Nov 14 '11 at 21:53
  • @GMan -- yes, I remember the discussions from a long time ago; I was under the impression it had long been settled. In any case, the FDIS has this wording applicable to this discussion in 5.3.1/1: **Note: a pointer to an incomplete type (other than cv void) can be dereferenced. The lvalue thus obtained can be used in limited ways (to initialize a reference, for example);** – Gene Bushuyev Nov 14 '11 at 22:32
  • @Gene: Incomplete types are different from uninitialized (or null) pointer values. – GManNickG Nov 15 '11 at 00:46
  • @GMan, but here `this` is a pointer to an incomplete type: 9.2/2 "Within the class member-specification, the class is regarded as complete within function bodies, default arguments, exception-specifications, and brace-or-equal-initializers for non-static data members (including such things in nested classes). **Otherwise it is regarded as incomplete within its own class member-specification.**" – Gene Bushuyev Nov 15 '11 at 01:43
  • @GeneBushuyev: To be clear, I'm talking about this claim: "It's also not a UB to dereference an uninitialized pointer if it's not used or in unevaluated context." If `tp` is uninitialized (`T* tp;`) or null (`T* tp = 0;`), it's undefined behavior. And again, the word 'incomplete' has nothing to do with *values*, only *types*. – GManNickG Nov 15 '11 at 01:48
  • @GMan -- I might be reading these comments superficially, and the discussion diverted to other issues. I was looking at the original `Baz() : Bar(new Foo(this))` assuming that OP was asking what would have happened if Foo initialized a reference instead of copying pointer. P.S. yes, types can be incomplete in one place and complete in another, and so the pointer. – Gene Bushuyev Nov 15 '11 at 02:06
  • @R. Martinho Fernandes, you are completely wrong. Reference can and will point to NULL. Only difference is difficulty finding out where is it pointing. int* pInt = NULL; int& refInt = *pInt; Compilers can implement almost anything as "nothing at all", including pointer access. Difference is purely psychological - most ppl are scared to death by word "address". – Agent_L Jan 15 '12 at 07:55
  • @R. Martinho Fernandes: also &(*(this)) is not dereferencing at all, it's just null op. – Agent_L Jan 15 '12 at 08:00
  • @Agent_L: I probably shouldn't even have bothered with correcting you. But I will anyway. **There are no null references in C++**. That's not psychological, that's a fact (*"(...) A reference shall be initialized to refer to a valid object or function [Note: in particular, a null reference cannot exist in a well-defined program (...)"* - from the C++ standard §8.3.2). For the example you gave a compiler is allowed to produce a program that outputs `"This is not a well-defined program. It helps to learn C++ before writing it."` – R. Martinho Fernandes Jan 15 '12 at 11:27
  • @Agent_L: And `&(*(this))` is dereferencing a pointer and applying operator& to it. It may or may not be an identity operation (http://www.ideone.com/mMGzF), but it always involves a dereference. – R. Martinho Fernandes Jan 15 '12 at 11:28
  • @R. Martinho Fernandes "Thou shall not init references to NULL" - that is exactly psychological difference. I'm talking about technical ones here. (If all programs were "well-defined" we wouldn't have any bugs) – Agent_L Jan 16 '12 at 10:49
  • @R. Martinho Fernandes There are two definitions of what "dereferencing" means. One is syntactic, and you're right there. The other is functional: "accessing the thing to which the pointer points". Writing just (*this) is no access at all. You've got me, I should have written a more elaborate example with & in a function argument. But the point of "no dereference here" still stands. – Agent_L Jan 16 '12 at 10:49

5

The behavior is not undefined, nor is this necessarily dangerous.

Neither Foo nor Bar do anything problematic with the pointers they receive.

This is the key: you just have to be aware that the object to which the pointer points is not yet fully constructed.

What if Foo::Foo(Bar*) was a Foo::Foo(Bar&) with the same semantics?

There's really no difference between the two, so far as dangerousness or definedness is concerned.
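
As a hedged sketch of the reference variant from Question 2: `Foo` binds a `Bar&` instead of storing a `Bar*`, and nothing reads through the reference until construction is over. The destructor, the deleted copy constructor and the helper `stored_reference_is_usable` are hypothetical additions, not part of the question.

```cpp
#include <cassert>

struct Bar;

struct Foo {
    Bar& b;                          // reference member instead of Bar* const
    Foo(Bar& b) : b(b) {}            // binds the reference, does not read through it
};

struct Bar {
    Foo* const f;
    Bar(Foo* f) : f(f) {}
};

struct Baz : Bar {
    Baz() : Bar(new Foo(*this)) {}   // *this binds to the Bar& parameter
    Baz(const Baz&) = delete;        // assumption: avoid a double delete through copies
    ~Baz() { delete f; }             // assumption: Baz owns the Foo it created
};

// Hypothetical helper: checks the stored reference once construction is complete.
bool stored_reference_is_usable() {
    Baz baz;
    // The reference refers to the same Bar subobject the pointer version would.
    assert(&baz.f->b == static_cast<Bar*>(&baz));
    return baz.f->b.f == baz.f;
}
```

As with the pointer version, the same caveat applies: the referenced object must not be used until it has been constructed.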

James McNellis
  • My concern is: If somebody were to have the great idea to access e.g. a member of the *uncreated* instance, everything would break. So I was wondering what the standard has to say about giving away uninitialised objects (Note: My chief concern is that it is given away BEFORE the base classes are created, so **NO** ctor was executed yet!). The question with the references has this rationale: Nothing can go wrong if nobody derefs the pointer. – bitmask Nov 14 '11 at 19:10
  • 1
    I'm not sure it's proper to say this isn't dangerous. Whether it's dangerous or not depends on what the constructor does with the pointer. If the constructor just stores the pointer then all is well. What this situation reveals is that the author of `Foo` should state clearly what assumptions the constructor makes about the pointer it accepts. Then the caller proceeds from that. – wilhelmtell Nov 14 '11 at 21:05
  • 1
    @wilhelmtell: Well, that's why I didn't say it _isn't_ dangerous; I said it _isn't necessarily_ dangerous. ;-) Personally, I would tend to avoid doing this unless you control all of the code in question (i.e., don't pass pointers to partially constructed objects to code that you don't control). But, I could imagine a scenario where that would be required, in which case documentation is key. – James McNellis Nov 14 '11 at 21:10

5

This question is answered directly in the C++ standard, 3.8/5:

Before the lifetime of an object has started but after the storage which the object will occupy has been allocated or, after the lifetime of an object has ended and before the storage which the object occupied is reused or released, any pointer that refers to the storage location where the object will be or was located may be used but only in limited ways. For an object under construction or destruction, see 12.7. Otherwise, such a pointer refers to allocated storage (3.7.4.2), and using the pointer as if the pointer were of type void*, is well-defined. Such a pointer may be dereferenced but the resulting lvalue may only be used in limited ways, as described below. The program has undefined behavior if:

  • the object will be or was of a class type with a non-trivial destructor and the pointer is used as the operand of a delete-expression,
  • the pointer is used to access a non-static data member or call a non-static member function of the object, or
  • the pointer is implicitly converted (4.10) to a pointer to a base class type, or
  • the pointer is used as the operand of a static_cast (5.2.9) (except when the conversion is to void*, or to void* and subsequently to char*, or unsigned char*), or
  • the pointer is used as the operand of a dynamic_cast (5.2.7).

Additionally, in 12.7/3:

To explicitly or implicitly convert a pointer (a glvalue) referring to an object of class X to a pointer (reference) to a direct or indirect base class B of X, the construction of X and the construction of all of its direct or indirect bases that directly or indirectly derive from B shall have started and the destruction of these classes shall not have completed, otherwise the conversion results in undefined behavior.
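
A small sketch of the distinction, using hypothetical `Widget` and `Observer` types that are not from the question: holding on to the pointer is harmless, while reading a data member through it before that member has been initialised is not. The problematic read is left as a comment so that the example stays well-defined.

```cpp
#include <cassert>

struct Widget;

struct Observer {
    Widget* const w;
    explicit Observer(Widget* w) : w(w) {
        // Fine: only the pointer value is stored here.
        // Reading w->value at this point would be the problematic kind of use,
        // because Widget::value has not been initialised yet.
    }
    int read() const;                // defined below, once Widget is complete
};

struct Widget {
    Observer obs;                    // initialised first (declaration order)
    int value;
    Widget() : obs(this), value(42) {}
};

// Safe to call only after the Widget's constructor has finished.
int Observer::read() const { return w->value; }

// Hypothetical helper: reads through the stored pointer after construction.
int read_after_construction() {
    Widget widget;
    return widget.obs.read();
}
```

Here the read happens only after `Widget`'s constructor has returned, so `read_after_construction()` yields the value set in the initializer list.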

Gene Bushuyev

4

That's a good question. If we read §3.8, the lifetime of an object with a non-trivial constructor only starts once the constructor has finished (“initialization is complete”). And a few paragraphs later, the standard delimits what we can and cannot do with a pointer “before the lifetime of an object has started but after the storage which the object will occupy has been allocated” (and the this pointer in an initialization list would certainly seem to fit into that category, given the above definition): in particular

The program has undefined behavior if:

[...]

  • the pointer is implicitly converted to a pointer to a base class type, or

[...]

In your example, the parameter of Foo's constructor has type pointer to base class (Bar*), so the this pointer of the derived class (a Baz*) must be implicitly converted to it, which is undefined behavior according to the above. But in order to call the constructor of the base class, the compiler itself must implicitly convert the address to type pointer to base class. So there must be some exceptions.
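
The conversion in question can be made visible with a sketch in which the base subobject is unlikely to sit at offset zero. `Pad`, `self`, the destructor and the helper function are hypothetical additions to the question's types. The check only shows that, on a given compiler, the `Bar*` handed to `Foo` agrees with the `this` that `Bar`'s own constructor sees; it does not settle what the standard guarantees.

```cpp
#include <cassert>

struct Bar;

struct Foo {
    Bar* const b;
    Foo(Bar* b) : b(b) {}
};

struct Bar {
    Foo* const f;
    Bar* const self;                 // hypothetical: records `this` as seen by Bar's constructor
    Bar(Foo* f) : f(f), self(this) {}
};

struct Pad { int x = 0; };           // hypothetical first base, so Bar likely has a non-zero offset

struct Baz : Pad, Bar {
    Baz() : Bar(new Foo(this)) {}    // Baz* -> Bar* conversion happens before Bar is constructed
    Baz(const Baz&) = delete;        // assumption: avoid a double delete through copies
    ~Baz() { delete f; }             // assumption: Baz owns the Foo it created
};

// Hypothetical helper: compares the converted pointer with the base constructor's `this`.
bool converted_pointer_matches_base_this() {
    Baz baz;
    assert(baz.self == static_cast<Bar*>(&baz));
    return baz.f->b == baz.self;
}
```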

In practice, I've never known a compiler to fail in this case, except in cases where virtual inheritance was involved; I've definitely encountered errors with the following pattern:

class L;
class VB {};
class R : virtual VB { public: R( L* ); };
class L { public: L( char const* p ); };
class D : private virtual L, private virtual R { public: D( char const* p ); };
D::D( char const* p ) : L( p ), R( this ) {}

Why the compiler had problems here, I don't know. It was able to correctly convert the pointer to pass it as the this pointer to the constructor of L, but it didn't do it correctly when passing it to R.

In this case, the work-around was to provide a wrapper class for L, with a member function which returned the pointer, e.g.:

class LW : public L
{
public:
    LW( char const* p ) : L( p ) {}
    L* getAddress() { return this; }
};

// assuming D now derives (virtually) from LW rather than directly from L
D::D( char const* p ) : LW( p ), R( this->getAddress() ) {}

The result of all this is that I can't give you a definite answer, because I'm not sure what the authors of the standard intended. On the other hand, I've actually seen cases where it doesn't work (and not that long ago).

James Kanze
  • Thanks for actually quoting the standard :) Your case involves virtual inheritance, which might be the reason for the compiler bug. At any rate: casting to a base class is already UB, which means my example is definitively UB. I wonder why it is undefined even if no virtual inheritance is involved; it doesn't really make sense to me. – bitmask Nov 14 '11 at 20:01
  • @bitmask According to the standard, my problem wasn't due to a compiler bug; the code has undefined behavior. _Why_ it's undefined is another question: in some arbitrary function, I can understand the issues (although they're not insurmountable); in the constructor, the compiler has to be able to do the conversion in order to pass the pointer to the base class constructors, so why not allow it otherwise. – James Kanze Nov 15 '11 at 10:12