
I've been reading through Clang source code and discovered something interesting about the ARM C++ ABI that I can't seem to understand the justification for. From an online version of the ARM ABI documentation:

This ABI requires C1 and C2 constructors to return this (instead of being void functions) so that a C3 constructor can tail call the C1 constructor and the C1 constructor can tail call C2.

(and similarly for non-virtual destructors)

I'm not sure what C1, C2, and C3 refer to here...this section is meant to be a modification of §3.1.5 from the generic (i.e. Itanium) ABI, but that section (at least in this online version) simply states:

Constructors return void results.

Anyway, I really can't figure out what the purpose of this is: how does making a constructor return this allow tail call optimization, and in what circumstances?

As far as I can tell, the only time a constructor could tail call another with the same this return value would be the case of a derived class with a single base class, a trivial constructor body, no members with non-trivial constructors, and no virtual table pointer. In fact, it seems like it would actually be easier, not harder, to optimize with a tail call with a void return, because then the restriction of a single base class could be eliminated (in the multiple base class case, the this pointer returned from the last called constructor will not be the this pointer of the derived object).

What am I missing here? Is there something about the ARM calling convention that makes the this return necessary?

Stephen Lin
  • I suspect that C1, C2 and C3 refer to "complete object constructor", "base object constructor" and "complete object allocating constructor", about which you can read more [here](http://stackoverflow.com/questions/6921295/dual-emission-of-constructor-symbols) – Michael Mar 16 '13 at 17:32
  • @Michael ahh, THAT makes sense (knew those existed, but didn't know the mangled name convention...I assumed `C1`,`C2`, and `C3` referred to constructors in some hierarchy in some diagram that was missing). I still don't understand the destructor case though. – Stephen Lin Mar 16 '13 at 17:37
  • @Michael ok, thanks for the link, understand the destructor case too now, I think...if no one answers this authoritatively I'll answer it myself at some point – Stephen Lin Mar 16 '13 at 17:41
  • From an efficiency point of view, this makes a lot of sense. Constructors presumably use the same conventions as other functions: `this` is passed in `r1`, but not required to be preserved by the callee. Ensuring it's passed out in `r0` ensures that the callee doesn't need to stash a copy. – marko Mar 16 '13 at 18:45
  • @Marko yeah, that's the explicit reason given in the ABI for the `D1` and `D2` destructors (see [Michael's linked question](http://stackoverflow.com/questions/6921295/dual-emission-of-constructor-symbols) for definitions) to return `this` actually...constructors can benefit similarly although the main reason is to allow tail calls from `C3` constructors. anyway, it's pretty clear now given those definitions, I'll self-answer if no one else decides to. – Stephen Lin Mar 16 '13 at 18:50
  • @Marko you mean the "caller" doesn't have to stash a copy, right? – Stephen Lin Mar 16 '13 at 18:52
  • @StephenLin I did indeed. – marko Mar 16 '13 at 18:54
  • @Marko `this` should be passed in `r0` . – auselen Mar 16 '13 at 23:20

1 Answer


Ok, helpful link from @Michael made this all clear...C1, C2, and C3 refer to the name-mangling of the "complete object constructor", "base object constructor", and "complete object allocating constructor", respectively, from the Itanium ABI:

  <ctor-dtor-name> ::= C1   # complete object constructor
                   ::= C2   # base object constructor
                   ::= C3   # complete object allocating constructor
                   ::= D0   # deleting destructor
                   ::= D1   # complete object destructor
                   ::= D2   # base object destructor

The C3/"complete object allocating constructor" is a version of the constructor that, rather than operating on already allocated storage passed to it via the this parameter, allocates memory internally (via operator new) and then calls the C1/"complete object constructor", which is the normal constructor used for the complete object case. Since the C3 constructor must return the this pointer to the newly allocated and constructed object, the C1 constructor must also return the this pointer in order for a tail call to be used.
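To make the tail call concrete, here is a rough sketch that models the two constructor variants as free functions. `Widget`, `Widget_C1`, `Widget_C3`, and the `_void` variants are hypothetical names invented for illustration; a real compiler emits these as mangled symbols rather than functions you can write by hand, so treat this as an analogy rather than what Clang actually generates.

```cpp
#include <cassert>
#include <new>

// Hypothetical class with a single member, used only for illustration.
struct Widget { int value; };

// Models an ARM-style C1 (complete object constructor): it constructs into
// storage supplied by the caller and returns the resulting `this` pointer.
Widget* Widget_C1(void* storage, int v) {
    return ::new (storage) Widget{v};
}

// Models C3 (complete object allocating constructor): allocate, then forward
// to C1. C1 already returns the value C3 has to return, so the call sits in
// tail position and a compiler may emit it as a plain branch.
Widget* Widget_C3(int v) {
    void* storage = ::operator new(sizeof(Widget));
    return Widget_C1(storage, v);
}

// The same pair under the generic rule that constructors return void.
void Widget_C1_void(void* storage, int v) {
    ::new (storage) Widget{v};
}

Widget* Widget_C3_void(int v) {
    void* storage = ::operator new(sizeof(Widget));
    Widget_C1_void(storage, v);            // not a tail call: `storage` must
    return static_cast<Widget*>(storage);  // stay live to be returned here
}

// Small usage check: both variants produce an equivalent object.
void demo_allocating_constructor() {
    Widget* a = Widget_C3(7);
    Widget* b = Widget_C3_void(7);
    assert(a->value == 7 && b->value == 7);
    ::operator delete(a);
    ::operator delete(b);
}
```

In the void version the allocating function has to keep the pointer alive across the call, typically in a callee-saved register or on the stack; in the version that returns this, nothing needs to survive the call.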

The C2/"base object constructor" is the constructor called by derived classes when constructing a base class subobject; the semantics of C1 and C2 constructors differ in case of virtual inheritance and could be implemented differently for optimization purposes as well. In the case of virtual inheritance, a C1 constructor could be implemented with calls to virtual base class constructors followed by a tail call to a C2 constructor, so the latter should also return this if the former does.
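The virtual inheritance case can be sketched the same way. The layout below is a deliberately simplified, hypothetical flattening of a class with one virtual base (`DerivedState`, `VBaseState`, and the `_C1`/`_C2` functions are made-up names, and real objects would also carry vtable pointers that this sketch ignores).

```cpp
#include <cassert>

// Hypothetical flattened layout: a Derived object with one virtual base.
struct VBaseState { int tag; };
struct DerivedState {
    int payload;       // Derived's own part
    VBaseState vbase;  // virtual base subobject, owned by the complete object
};

// Base object constructor of the virtual base; returns its own `this`.
VBaseState* VBase_C2(VBaseState* self) {
    self->tag = 1;
    return self;
}

// C2 of Derived: initializes only Derived's own part and leaves virtual
// bases to whichever class is most derived.
DerivedState* Derived_C2(DerivedState* self, int p) {
    self->payload = p;
    return self;
}

// C1 of Derived: constructs the virtual bases first, then finishes with a
// call to C2 on the same `this`. Since C2 returns that pointer, the last
// call is in tail position.
DerivedState* Derived_C1(DerivedState* self, int p) {
    VBase_C2(&self->vbase);      // result unused: it points at a different subobject
    return Derived_C2(self, p);  // candidate tail call
}

// Small usage check.
void demo_virtual_base() {
    DerivedState d{};
    DerivedState* r = Derived_C1(&d, 42);
    assert(r == &d && d.payload == 42 && d.vbase.tag == 1);
}
```

Only the final call can be a tail call, and only because it operates on the same this pointer that C1 itself must return.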

The destructor case is slightly different but related. As per the ARM ABI:

Similarly, we require D2 and D1 to return this so that D0 need not save and restore this and D1 can tail call D2 (if there are no virtual bases). D0 is still a void function.

The D0/"deleting destructor" is used when deleting an object: it calls the D1/"complete object destructor" and then calls operator delete with the this pointer to free the memory. Having the D1 destructor return this allows the D0 destructor to use its return value to call operator delete, rather than having to save it to another register or spill it to memory; similarly, the D2/"base object destructor" should return this as well.
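A similar hand-written model of the destructor variants, again with hypothetical names (`Node`, `Node_D0`, `Node_D1`, `Node_D2`, and the `g_destroyed` counter exist only for this sketch), shows how D0 can consume D1's return value directly:

```cpp
#include <cassert>
#include <new>

// Hypothetical class used only for illustration.
struct Node { int id; };

int g_destroyed = 0;  // counts simulated destructor runs, for the check below

// Models D2 (base object destructor): does the teardown, returns `this`.
Node* Node_D2(Node* self) {
    ++g_destroyed;
    return self;
}

// Models D1 (complete object destructor): with no virtual bases there is
// nothing else to do, so it can simply tail call D2.
Node* Node_D1(Node* self) {
    return Node_D2(self);
}

// Models D0 (deleting destructor): still a void function. It passes D1's
// return value straight to operator delete, so `self` does not need to be
// preserved across the call to D1.
void Node_D0(Node* self) {
    ::operator delete(Node_D1(self));
}

// Small usage check.
void demo_deleting_destructor() {
    int before = g_destroyed;
    Node* n = ::new (::operator new(sizeof(Node))) Node{3};
    Node_D0(n);
    assert(g_destroyed == before + 1);
}
```

On ARM the returned pointer arrives in `r0`, which is also where the first argument to operator delete goes, so under this model no register move or spill is needed between the two calls.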

The ARM ABI also adds:

We do not require thunks to virtual destructors to return this. Such a thunk would have to adjust the destructor’s result, preventing it from tail calling the destructor, and nullifying any possible saving.

Consequently, only non-virtual calls of D1 and D2 destructors can be relied on to return this.

If I understand this correctly, it means that this save-restore-elision optimization can only be used when D0 calls D1 statically (i.e. in the case of a non-virtual destructor).

Stephen Lin