Why can a T* be passed in register, but a unique_ptr cannot?

Question

I'm watching Chandler Carruth's talk at CppCon 2019. In it, he describes how surprised he was by just how much overhead you incur by using a std::unique_ptr<int> instead of an int*; that segment starts at about 17:25.

You can have a look at the compilation results of his pair of example snippets (godbolt.org) to see that, indeed, the compiler seems unwilling to pass the unique_ptr value (which, at bottom, is just an address) in a register; it is passed only through memory.

One of the points Mr. Carruth makes at around 27:00 is that the C++ ABI requires by-value parameters (some but not all; perhaps - non-primitive types? non-trivially-constructible types?) to be passed in-memory rather than within a register.

My questions:

Is this actually an ABI requirement on some platforms? (which?) Or maybe it's just some pessimization in certain scenarios?
Why is the ABI like that? That is, if the fields of a struct/class fit within registers, or even a single register - why should we not be able to pass it within that register?
Has the C++ standards committee discussed this point in recent years, or ever?

PS - So as not to leave this question with no code:

Plain pointer:

void bar(int* ptr) noexcept;
void baz(int* ptr) noexcept;

void foo(int* ptr) noexcept {
    if (*ptr > 42) {
        bar(ptr); 
        *ptr = 42; 
    }
    baz(ptr);
}

Unique pointer:

#include <memory>   // std::unique_ptr
#include <utility>  // std::move
using std::unique_ptr;
void bar(int* ptr) noexcept;
void baz(unique_ptr<int> ptr) noexcept;

void foo(unique_ptr<int> ptr) noexcept {
    if (*ptr > 42) { 
        bar(ptr.get());
        *ptr = 42; 
    }
    baz(std::move(ptr));
}

I'm not sure what the ABI requirement is exactly, but it [does not ban putting structs in registers](https://godbolt.org/z/od4-wY) — harold, Oct 11 '19 at 10:25
If I had to guess I'd say it has to do with non-trivial member functions needing a `this` pointer that points at a valid location. `unique_ptr` has those. Spilling the register for that purpose would kinda negate the whole "pass in a register" optimization. — StoryTeller - Unslander Monica, Oct 11 '19 at 10:34
https://itanium-cxx-abi.github.io/cxx-abi/abi.html#calls. So this behavior required. Why? https://itanium-cxx-abi.github.io/cxx-abi/cxx-closed.html, search for the issue C-7. There is some explanation there, but it is not too detailed. But yeah, this behavior doesn't seem logical to me. These objects could be passed through stack normally. Pushing them to stack, and then passing the reference (just for "non-trivial" objects) seems a waste. — geza, Oct 11 '19 at 10:37
And, btw., Chandler is not too fair here. `unique_ptr` does more, it is not just "passing ownership". If we have an abstraction, which does **only** this, this would be free (it is not just about exception safety). — geza, Oct 11 '19 at 10:43
Yeah, perhaps. Btw., we had less issues, if the called function called the destructors, like MSVC does it (calling them at the end of the expression doesn't seem useful for me. But maybe I don't see something. Value parameters are not typical temporaries, which must be destroyed at the end of the full-expression). — geza, Oct 11 '19 at 11:30
It seems C++ is violating its own principles here, which is quite sad. I was 140% convinced any unique_ptr just disappears after compilation. After all it's just a deferred destructor call which is known at compile-time. — One Man Monkey Squad, Oct 11 '19 at 12:10
@OneManMonkeySquad Zero-cost in C++ rather means that the cost is the same as if you wrote it by hand. One uses `std::unique_ptr` to avoid leaks and the compiler generates that leak avoidance (destructor calls) code for you. — Maxim Egorushkin, Oct 11 '19 at 12:46
@MaximEgorushkin: If you had written it by hand, you would have put the pointer in a register and not on the stack. — einpoklum, Oct 11 '19 at 13:37
@OneManMonkeySquad: I suspect this is not a serious concern in practice. (The fact that nobody seems to have noticed for years supports this suspicion.) The increased cost here is small and only happens at function entry/exit. So this overhead will only matter if the function is both (a) trivial and (b) called a huge number of times. In that case, your profiler will tell you that you need to inline it... Which is the same as it ever was. — Nemo, Oct 14 '19 at 15:41
@MaximEgorushkin "_means that the cost is the same as if you wrote it by hand_" Exactly. What about a class w/ members `copy` and `destroy` instead of a copy ctor and a dtor, used such that the normal member functions are called (by the user) when the special member functions would be called (by the compiler)? If it's more efficient, this means C++ as it's meant to be used has overhead of C like code, which is bad. — curiousguy, Nov 16 '19 at 05:38
@Nemo The cost of dynamic allocation is non trivial so the overhead of a smart ptr class is "amortized". Still the issue is important. — curiousguy, Nov 16 '19 at 05:40
@curiousguy: It's not amortized at all. The heap allocation+deallocation happen once, but the unique_ptr can get passed around many times. My argument is not about heap allocation; it's about function entry, which is the only time these overheads appear. (Chandler's video starts by observing that unique_ptr within a block really does have zero overhead compared to a raw pointer. It is only at function call boundaries that there is a "problem".) — Nemo, Nov 16 '19 at 22:26
@Nemo "Nobody noticed for years" is a very weak point: nobody noticed that Java allowed the destruction of an object that was in use, both at the spec level and in practice with some optimizers, for years. And quite a few Java fan cared about it. — curiousguy, Nov 20 '19 at 18:52
@Nemo: this is why you compile with link-time optimization enabled, so this inlining can happen automatically even across compilation units (not just for code in `.h` files). Or you put all trivial function in header files. Compilers are good at inlining, you don't have to do it yourself. — Peter Cordes, Dec 08 '19 at 12:21

Maxim Egorushkin · Accepted Answer · 2020-06-19T19:06:20.413

Is this actually an ABI requirement, or maybe it's just some pessimization in certain scenarios?

One example is the System V Application Binary Interface AMD64 Architecture Processor Supplement. This ABI is for 64-bit x86-compatible CPUs (Linux x86_64 architecture). It is followed on Solaris, Linux, FreeBSD, macOS, Windows Subsystem for Linux:

If a C++ object has either a non-trivial copy constructor or a non-trivial destructor, it is passed by invisible reference (the object is replaced in the parameter list by a pointer that has class INTEGER).

An object with either a non-trivial copy constructor or a non-trivial destructor cannot be passed by value because such objects must have well defined addresses. Similar issues apply when returning an object from a function.

Note, that only 2 general purpose registers can be used for passing 1 object with a trivial copy constructor and a trivial destructor, i.e. only values of objects with sizeof no greater than 16 can be passed in registers. See Calling conventions by Agner Fog for a detailed treatment of the calling conventions, in particular §7.1 Passing and returning objects. There are separate calling conventions for passing SIMD types in registers.

There are different ABIs for other CPU architectures.

There is also Itanium C++ ABI which most compilers comply with (apart from MSVC), which requires:

If the parameter type is non-trivial for the purposes of calls, the caller must allocate space for a temporary and pass that temporary by reference.

A type is considered non-trivial for the purposes of calls if:

it has a non-trivial copy constructor, move constructor, or destructor, or

all of its copy and move constructors are deleted.

This definition, as applied to class types, is intended to be the complement of the definition in [class.temporary]p3 of types for which an extra temporary is allowed when passing or returning a type. A type which is trivial for the purposes of the ABI will be passed and returned according to the rules of the base C ABI, e.g. in registers; often this has the effect of performing a trivial copy of the type.
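The quoted definition can be approximated with standard type traits. This is only a sketch: the helper name approx_trivial_for_calls is made up for this illustration and is not part of any ABI or library, and the traits cannot see every case the ABI rule covers (for instance, they cannot tell a deleted constructor from an inaccessible one).

```cpp
#include <memory>
#include <type_traits>

// Hypothetical helper: a rough, trait-based stand-in for the Itanium C++ ABI
// notion of "trivial for the purposes of calls".
template <class T>
constexpr bool approx_trivial_for_calls() {
    return
        // copy construction is either unavailable or trivial
        (!std::is_copy_constructible<T>::value ||
         std::is_trivially_copy_constructible<T>::value) &&
        // move construction is either unavailable or trivial
        (!std::is_move_constructible<T>::value ||
         std::is_trivially_move_constructible<T>::value) &&
        // not every copy/move constructor is missing
        (std::is_copy_constructible<T>::value ||
         std::is_move_constructible<T>::value) &&
        // the destructor is trivial
        std::is_trivially_destructible<T>::value;
}

// A raw pointer qualifies; std::unique_ptr does not, because its move
// constructor and destructor are non-trivial.
static_assert(approx_trivial_for_calls<int*>(), "raw pointer is trivial");
static_assert(!approx_trivial_for_calls<std::unique_ptr<int>>(),
              "unique_ptr is non-trivial for calls");
```

Under this approximation, the int* parameter from the question is eligible for the base C ABI rules (registers), while the unique_ptr<int> parameter is not.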

Why is the ABI like that? That is, if the fields of a struct/class fit within registers, or even a single register - why should we not be able to pass it within that register?

It is an implementation detail, but when an exception is handled, during stack unwinding, the objects with automatic storage duration being destroyed must be addressable relative to the function stack frame because the registers have been clobbered by that time. Stack unwinding code needs objects' addresses to invoke their destructors but objects in registers do not have an address.

Pedantically, destructors operate on objects:

An object occupies a region of storage in its period of construction ([class.cdtor]), throughout its lifetime, and in its period of destruction.

and an object cannot exist in C++ if no addressable storage is allocated for it because object's identity is its address.

When the address of an object with a trivial copy constructor kept in registers is needed, the compiler can just store the object into memory and obtain the address. If the copy constructor is non-trivial, on the other hand, the compiler cannot just store it into memory; it rather needs to call the copy constructor, which takes a reference and hence requires the address of the object in the registers. The calling convention probably cannot depend on whether the copy constructor was inlined in the callee or not.
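To make the role of the address concrete, here is a minimal sketch of a type whose copy constructor observes its own address. The type SelfAware and the function names are hypothetical, invented for this illustration. Because the copy constructor records this, the copy has to be constructed at the very address the callee will later use for the parameter, which is what passing by invisible reference guarantees.

```cpp
#include <cassert>

// Hypothetical type whose invariant depends on its own address.
struct SelfAware {
    const SelfAware* self;

    SelfAware() : self(this) {}
    // Non-trivial copy constructor: the copy points at itself, not at the source.
    SelfAware(const SelfAware&) : self(this) {}

    bool consistent() const { return self == this; }
};

// By-value parameter: the copy constructor runs on the parameter object itself,
// so the invariant must hold at the address the callee sees.
bool callee_sees_consistent_object(SelfAware byval) {
    return byval.consistent();
}

bool demo() {
    SelfAware original;
    assert(original.consistent());
    return callee_sees_consistent_object(original);
}
```

If the caller built the copy somewhere and the callee then saw the bits at a different location, consistent() would fail; a plain bitwise transfer through a register cannot preserve this invariant in general.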

Another way to think about this, is that for trivially copyable types the compiler transfers the value of an object in registers, from which an object can be recovered by plain memory stores if necessary. E.g.:

void f(long*);
void g(long a) { f(&a); }

on x86_64 with System V ABI compiles into:

g(long):                             // Argument a is in rdi.
        push    rax                  // Align stack, faster sub rsp, 8.
        mov     qword ptr [rsp], rdi // Store the value of a in rdi into the stack to create an object.
        mov     rdi, rsp             // Load the address of the object on the stack into rdi.
        call    f(long*)             // Call f with the address in rdi.
        pop     rax                  // Faster add rsp, 8.
        ret                          // The destructor of the stack object is trivial, no code to emit.

In his thought-provoking talk Chandler Carruth mentions that a breaking ABI change may be necessary (among other things) to implement the destructive move that could improve things. IMO, the ABI change could be non-breaking if the functions using the new ABI explicitly opt-in to have a new different linkage, e.g. declare them in extern "C++20" {} block (possibly, in a new inline namespace for migrating existing APIs). So that only the code compiled against the new function declarations with the new linkage can use the new ABI.
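For what it's worth, Clang already offers a per-class opt-in of roughly this flavour, the trivial_abi attribute, which asks the compiler to pass a class as if it were trivial for the purposes of calls, with the callee running the destructor. The sketch below is hedged accordingly: the macro SKETCH_TRIVIAL_ABI and the type OwnedInt are made up for this example, the attribute is a Clang extension rather than standard C++, and on other compilers the macro expands to nothing, so the class is passed in memory as usual.

```cpp
#include <cassert>

// Use the Clang extension only where the compiler reports support for it.
#if defined(__has_cpp_attribute)
#  if __has_cpp_attribute(clang::trivial_abi)
#    define SKETCH_TRIVIAL_ABI [[clang::trivial_abi]]
#  endif
#endif
#ifndef SKETCH_TRIVIAL_ABI
#  define SKETCH_TRIVIAL_ABI  // no-op fallback on compilers without the attribute
#endif

// Hypothetical owning wrapper, similar in spirit to unique_ptr<int>.
struct SKETCH_TRIVIAL_ABI OwnedInt {
    int* ptr;

    explicit OwnedInt(int* p) noexcept : ptr(p) {}
    OwnedInt(OwnedInt&& other) noexcept : ptr(other.ptr) { other.ptr = nullptr; }
    OwnedInt(const OwnedInt&) = delete;
    OwnedInt& operator=(const OwnedInt&) = delete;
    ~OwnedInt() { delete ptr; }
};

// Takes ownership by value; the int is freed when the parameter is destroyed.
int read_and_destroy(OwnedInt p) noexcept {
    assert(p.ptr != nullptr);
    return *p.ptr;
}
```

The observable behaviour is the same with or without the attribute; only the calling convention, and which side runs the parameter's destructor, changes.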

Note that the ABI doesn't apply when the called function has been inlined. Likewise, with link-time code generation the compiler can inline functions defined in other translation units or use custom calling conventions.

Comments are not for extended discussion; this conversation has been [moved to chat](https://chat.stackoverflow.com/rooms/200872/discussion-on-answer-by-maxim-egorushkin-why-can-a-t-be-passed-in-register-but). — Samuel Liew, Oct 14 '19 at 22:57
It's not really linkage that would need to change, but either the calling convention or the class type. Something like `__declspec(register)` on the class should be sufficient. — user541686, May 23 '21 at 08:06

einpoklum · Answer 2 · 2019-10-14T14:58:55.320

With common ABIs, non-trivial destructor -> can't pass in registers

(An illustration of a point in @MaximEgorushkin's answer using @harold's example in a comment; corrected as per @Yakk's comment.)

If you compile:

struct Foo { int bar; };
Foo test(Foo byval) { return byval; }

you get:

test(Foo):
        mov     eax, edi
        ret

i.e. the Foo object is passed to test in a register (edi) and also returned in a register (eax).

When the destructor is not trivial (like the std::unique_ptr example of OP's), common ABIs require placement on the stack. This is true even if the destructor does not use the object's address at all.

Thus even in the extreme case of a do-nothing destructor, if you compile:

struct Foo2 {
    int bar;
    ~Foo2() {  }
};

Foo2 test(Foo2 byval) { return byval; }

you get:

test(Foo2):
        mov     edx, DWORD PTR [rsi]
        mov     rax, rdi
        mov     DWORD PTR [rdi], edx
        ret

with useless loading and storing.
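What decides this is whether the destructor is user-provided, not whether its body is empty. A small sketch with type traits (Foo and Foo2 are the types above; Foo3 is an extra hypothetical variant added for comparison):

```cpp
#include <type_traits>

struct Foo  { int bar; };                     // implicit destructor: trivial
struct Foo2 { int bar; ~Foo2() {} };          // user-provided, though empty: non-trivial
struct Foo3 { int bar; ~Foo3() = default; };  // defaulted on first declaration: trivial

static_assert(std::is_trivially_destructible<Foo>::value,   "Foo is trivial");
static_assert(!std::is_trivially_destructible<Foo2>::value, "Foo2 is not trivial");
static_assert(std::is_trivially_destructible<Foo3>::value,  "Foo3 is trivial");
```

So writing `= default` instead of `{ }` keeps the type trivial, and it should then be treated like Foo rather than like Foo2 by the ABI rules quoted in the accepted answer.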

edited Oct 14 '19 at 14:58

answered Oct 11 '19 at 16:30

einpoklum


I am not convinced by this argument. The non-trivial destructor does nothing to prohibit the as-if rule. If the address is not observed, there is absolutely no reason why there needs to be one. So a conforming compiler could happily put it in a register, if doing so does not change the observable behavior (and current compilers will in fact do so [if the callers are known](https://godbolt.org/z/c56RAT)). – ComicSansMS Oct 12 '19 at 08:30
@ComicSansMS: Are you not convinced that this is _reasonable_, or that this is the standard? If it's the former - I share your frustration. – einpoklum Oct 12 '19 at 08:34
Unfortunately, it's the other way (I agree that some of this is already beyond reason). To be precise: I am not convinced that the reasons you provided would necessarily render any conceivable ABI that allowed passing the current `std::unique_ptr` in a register non-conformant. – ComicSansMS Oct 12 '19 at 08:54
@ComicSansMS Perhaps this makes sense: In the example, the body of the trivial destructor is available and hence the compiler *may* inline it. Even worse, in different translation units the compiler might sometimes inline it and sometimes not, hence you *must* be prepared for it sometimes to not be inlined, i.e., being treated as if the method body was not even available and hence triviality could not be determined. Hence you *must* use a calling convention that is compatible with a non-trivial destructor – Hagen von Eitzen Oct 12 '19 at 15:03
@HagenvonEitzen: If the body is available in the definition of the class, and is known to be effectively-trivial, inlining shouldn't even matter. – einpoklum Oct 12 '19 at 15:06
@ComicSansMS: Agreed; the as-if rule itself doesn't rule out anything (so it allows for the possibility of whole-program optimization, and custom calling conventions depending on properties discovered by the optimizer). To explain your point another way: real-life compilers want to work for separately-compiled sources, and making the calling-convention rules depend on anything the optimizer can/can't prove is too brittle for most people. (Optimized code maybe couldn't call debug code, and you need the same optimization options with everything.) – Peter Cordes Oct 14 '19 at 02:26
So the real-life ABI design choices aren't restricted purely by the as-if rule. I've sometimes wondered whether it would be possible to design a sane / usable ABI where non-trivially-copyable objects could be passed/returned in registers. Or maybe a non-trivial destructor shouldn't block passing in registers: the receive side can always spill it if needed. (As long as it didn't also have a non-trivial constructor that would make it need an address in the sending side.) But with a non-trivial constructor, we need the optimizer to prove it's not address-dependent - possible but not nice. – Peter Cordes Oct 14 '19 at 02:29
"trivial destructor [CITATION NEEDED]" clearly false; if no code actually depends on the address, then as-if means the address need not exist _on the actual machine_. The address must exist *in the abstract machine*, but things in the abstract machine that have no impact on the actual machine are things *as if* is allowed to eliminate. – Yakk - Adam Nevraumont Oct 14 '19 at 13:56
@Yakk-AdamNevraumont: So, there is nothing in the standard per se which precludes passing a `struct Foo2` through registers? – einpoklum Oct 14 '19 at 14:18
@einpoklum There is nothing in the standard that states registers exist. The register keyword just states "you cannot take the address". There is only an abstract machine as far as the standard is concerned. "as if" means that any real machine implementation need only behave "as if" the abstract machine behaves, up to behavior undefined by the standard. Now, there are very challenging problems around having an object in an register, which everyone has talked about extensively. Also, calling conventions, which the standard also does not discuss, have practical needs. – Yakk - Adam Nevraumont Oct 14 '19 at 14:30
@einpoklum No, in that abstract machine, all things have addresses; but addresses are only observable in certain circumstances. The `register` keyword was intended to make it trivial for the physical machine to store something in a register by blocking things that practically make it harder to "have no address" in the physical machine. – Yakk - Adam Nevraumont Oct 14 '19 at 14:53
@Yakk-AdamNevraumont: I've edited my answer, but given what you've said - perhaps I should just delete it? – einpoklum Oct 14 '19 at 14:59
@HagenvonEitzen "_in different translation units the compiler might sometimes inline it and sometimes not, (...) i.e., being treated as if the method body was not even available_" What does that have to do w/ inlining the code = not generating a call sequence? Either the body of a function is defined inline or not; if it is, *the compiler gets to know what the body can do* (f.ex. it might be able to determine the function never throws, even if it doesn't have an exception spec). How would generating a call sequence negate that? – curiousguy Nov 16 '19 at 05:52
@Yakk-AdamNevraumont "_The register keyword just states "you cannot take the address"._" .. (C only) – curiousguy Nov 16 '19 at 05:55
@PeterCordes "_depend on anything the optimizer can/can't prove is too brittle for most people_" More than "too brittle", it's provably broken beyond repair: a non inline normal (non template) function `f()` is defined exactly once, in some TU "x.cc". So when compiling "x.cc" the body of `f()` is available, and not in any other TU; calls to `f()` would be compiled differently in "x.cc" and elsewhere (which might be OK is `f()` has multiple entry points); but functions involving calls to `f()`, and classes w/ such members could end up not following the same convention in "x.cc" and elsewhere! – curiousguy Nov 16 '19 at 06:54
This is also the case with a simple const static class member: only in the TU where it is defined is the initializer known so the code in that TU can assume the variable's value, other cannot. So non inline functions called by multiply defined "functions" (either inline functions or function templates) are not even necessary for that to happen. Note that such diff of information content between TU arises by respecting the ODR, not by breaking it. – curiousguy Nov 16 '19 at 06:58
"_there is nothing in the standard per se which precludes passing a struct Foo2 through registers_" The std is specified in term of object identity. Pure values, like 2 or `&x` don't have an identity, but all variables have one (incl. those of type `int` or `T*`). A pure value is a value, an object contains a value. An address comparison on distinct objects must return false (even if both are const and hold the same value). An address may be taken directly or via a reference bound directly: `int &r = x; int *p = &r;` Many local var don't have their address taken. – curiousguy Nov 16 '19 at 07:30
@curiousguy: An "address" could theoretically be in a space which includes registers and main memory. And then in-register data can have an address, which is just optimized away. – einpoklum Nov 16 '19 at 08:08
@einpoklum-reinstateMonica Not sure... I always go back to that annoying technical detail: pointers are trivial types, so all adresses must be representable in the bits of a real pointer. Also that fact make a lot of the std senseless and makes almost all compilers broken, so in fact it's a defect, ptr should not be described as trivial types, or a very elaborate definition of trivial should be invented that makes sense logically. See all my Q re: ptr – curiousguy Nov 16 '19 at 18:03
@curiousguy: If plain pointers can represent address in GPU and in CPU memory, as well as interaction with certain devices through MMIO - I see no reason why they couldn't also represent register locations. – einpoklum Nov 16 '19 at 19:00
@curiousguy: We're talking about an arbitrary implementation of the abstract machine. You could decide that a part of the address space is reserved for what's in the registers, and the memory starts at some non-zero address. – einpoklum Nov 16 '19 at 19:49
Let us [continue this discussion in chat](https://chat.stackoverflow.com/rooms/202481/discussion-between-curiousguy-and-einpoklum-reinstate-monica). – curiousguy Nov 16 '19 at 21:31

score 2 · Answer 3 · answered Oct 11 '19 at 22:01

Is this actually an ABI requirement on some platforms? (which?) Or maybe it's just some pessimization in certain scenarios?

If something is visible at the compilation unit boundary, then whether it is defined implicitly or explicitly, it becomes part of the ABI.

Why is the ABI like that?

The fundamental problem is that registers get saved and restored all the time as you move down and up the call stack. So it's not practical to have a reference or pointer to them.

In-lining and the optimizations that result from it is nice when it happens, but an ABI designer can't rely on it happening. They have to design the ABI assuming the worst case. I don't think programmers would be very happy with a compiler where the ABI changed depending on the optimization level.

A trivially copyable type can be passed in registers because the logical copy operation can be split into two parts. The parameters are copied to the registers used for passing parameters by the caller and then copied to the local variable by the callee. Whether the local variable has a memory location or not is thus only the concern of the callee.

A type where a copy or move constructor must be used, on the other hand, cannot have its copy operation split up in this way, so it must be passed in memory.
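The two-part split for a trivially copyable type can be sketched with memcpy standing in for the register transfer. The type Point and the function names are hypothetical, and the 64-bit integer merely plays the role of a register; real compilers do this without any library call.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

// Hypothetical trivially copyable type that fits one 64-bit register.
struct Point {
    std::int32_t x;
    std::int32_t y;
};
static_assert(sizeof(Point) == sizeof(std::uint64_t),
              "sketch assumes Point occupies exactly 8 bytes");

// Caller half: copy the object's bytes into the "register".
std::uint64_t to_register_image(const Point& p) {
    std::uint64_t bits;
    std::memcpy(&bits, &p, sizeof bits);
    return bits;
}

// Callee half: materialize a local object from the "register" if one is needed.
Point from_register_image(std::uint64_t bits) {
    Point p;
    std::memcpy(&p, &bits, sizeof p);
    return p;
}

bool round_trips() {
    Point in{3, -4};
    Point out = from_register_image(to_register_image(in));
    assert(out.x == in.x);
    return out.x == 3 && out.y == -4;
}
```

Neither half needs to know where the other side keeps its object, which is exactly what fails once a user-provided copy or move constructor has to run on a specific address.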

Has the C++ standards committee discussed this point in recent years, or ever?

I have no idea if the standards bodies have considered this.

The obvious solution to me would be to add proper destructive moves (rather than the current half-way house of a "valid but otherwise unspecified state") to the language, then introduce a way to flag a type as allowing for "trivial destructive moves" even if it does not allow for trivial copies.

But such a solution WOULD require breaking the ABI of existing code to implement for existing types, which may bring a fair bit of resistance (though ABI breaks as a result of new C++ standard versions are not unprecedented; for example, the std::string changes in C++11 resulted in an ABI break).
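What a trivial destructive move would buy can be imitated by hand today: release the pointer in the caller, pass the raw pointer (which travels in a register), and re-adopt it in the callee. This is only a sketch of that manual workaround, not a language feature; the function names consume_raw and hand_off are hypothetical.

```cpp
#include <cassert>
#include <memory>

// Callee half: adopts ownership of a pointer that arrives as a plain int*.
int consume_raw(int* raw) noexcept {
    assert(raw != nullptr);
    std::unique_ptr<int> owned(raw);  // ownership restarts here
    return *owned + 1;
}   // owned deletes the int here

// Caller half: release() ends this unique_ptr's ownership, so its destructor
// later finds a null pointer and has nothing to delete.
int hand_off(std::unique_ptr<int> p) noexcept {
    return consume_raw(p.release());
}
```

The release/adopt pair is a destructive move written out by hand: one lifetime of ownership ends and another begins, and only a bare pointer value crosses the function boundary.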

answered Oct 11 '19 at 22:01

plugwash


Can you elaborate on how proper destructive moves would allow for a unique_ptr to be passed in a register? Would that be because it would allow dropping the requirement for addressable storage? – einpoklum Oct 11 '19 at 22:13
Proper destructive moves would enable a concept of trivial destructive moves to be introduced. This would allow said trivial move to be split up by the ABI in the same way that trivial copies can be today. – plugwash Oct 11 '19 at 22:24
Though you would also want to add a rule that a compiler could implement a parameter pass as a regular move or copy followed by a "trivial destructive move" to ensure that it was always possible to pass in registers no matter where the parameter came from. – plugwash Oct 11 '19 at 22:26
Because the register size can hold a pointer, but a unique_ptr structure? What's sizeof(unique_ptr)? – Mel Viso Martinez Oct 16 '19 at 05:23
@MelVisoMartinez You may be confusing `unique_ptr` and `shared_ptr` semantics: `shared_ptr` lets you provide to the ctor 1) a ptr x to derived object U to be deleted with static type U w/ the expression `delete x;` (so you don't need a virtual dtor here) 2) or even a custom cleanup function. That means that runtime state is used inside the `shared_ptr` control block to encode that information. OTOH `unique_ptr` has no such functionality and does not encode deletion behavior in state; the only way to customize cleanup is to create another template instantiation (another class type). – curiousguy Nov 16 '19 at 06:13
@plugwash "_Proper destructive moves_" do you mean introducing a c+dtor concept in C++: a function that terminates a lifetime and starts another? That would be a radical change of the basic lifetime axioms. I suppose that such would not behave as an explicit dtor call, which is a runtime construct that doesn't end the scope of a variable, but an explicit scope end which prevents the implicit dtor call by the compiler. – curiousguy Nov 16 '19 at 06:19
Right, by a "proper destructive move" I do indeed mean something that ends one lifetime and starts another. There are many objects for which a copy is non-trivial but a destructive move is trivial. With the current semantics a move requires the source to be left in a "valid but otherwise unspecified" state, which in turn means a user-defined constructor, which in turn makes it impractical to pass the parameter in a register. – plugwash Nov 16 '19 at 06:25
Exactly what semantics this should have is a good question, one way to do it would be to allow the compiler to make the regular move/copy to a temporary and then allow it to make a "true destructive move" to actually pass the parameter. This would allow an ABI to pass things like unique_ptrs in registers without radically changing the language and the regular copy/move would be on one side of the function barrier and so could be handled by the optimiser. – plugwash Nov 16 '19 at 06:27
"_with a compiler where the ABI changed depending on the optimization level_" It remains to be seen that linking TU compiled w/ diff codegen options is valid and respects the meaning of the ODR. – curiousguy Nov 16 '19 at 07:09
That interpretation of the ODR would basically mean that compilers had to ship multiple copies of the standard library for different optimization levels. – plugwash Nov 16 '19 at 08:53
Sadly I suspect the time to fix this is passed, even if features were added to allow unique_ptrs to be passed in registers I suspect that at least the linux implementations would not want the ABI break resulting from actually using those features. I suspect the half-assed approach to moves added in C++11 will be with us for a long time – plugwash Nov 16 '19 at 09:31
@plugwash Can you write more about your concept of "proper destructive move"? That sounds interesting (but probably off topic so let's find another place). – curiousguy Dec 08 '19 at 05:21
@MelVisoMartinez see https://stackoverflow.com/questions/13460395/how-can-stdunique-ptr-have-no-size-overhead – plugwash Jan 27 '20 at 21:39
Even then, this is implementation dependent. As far as I can remember, apart from some very basic types, no standard mandates implementation sizes. I have worked in C++ since its beginning and feel that my opinion can't help here. – Mel Viso Martinez Jan 27 '20 at 22:40

curiousguy · Answer 4 · 2019-12-08T03:59:38.240

First we need to go back to what it means to pass by value and by reference.

For languages like Java and SML, pass by value is straightforward (and there is no pass by reference), just as copying a variable value is, as all variables are just scalars and have built-in copy semantics: they are either what would count as arithmetic types in C++, or "references" (pointers with a different name and syntax).

In C we have scalar and user defined types:

Scalars have a numeric or abstract value (pointers are not numbers, they have an abstract value) that is copied.
Aggregate types have all their possibly initialized members copied:
- for product types (arrays and structures): recursively, all members of structures and elements of arrays are copied (the C function syntax doesn't make it possible to pass arrays by value directly, only arrays members of a struct, but that's a detail).
- for sum types (unions): the value of the "active member" is preserved; obviously, member by member copy isn't in order as not all members can be initialized.

In C++ user defined types can have user defined copy semantic, which enable truly "object oriented" programming with objects with ownership of their resources and "deep copy" operations. In such case, a copy operation is really a call to a function that can almost do arbitrary operations.

For C structs compiled as C++, "copying" is still defined as calling the user defined copy operation (either constructor or assignment operator), which are implicitly generated by the compiler. It means that the semantic of a C/C++ common subset program is different in C and C++: in C a whole aggregate type is copied, in C++ an implicitly generated copy function is called to copy each member; the end result being that in either case each member is copied.

(There is an exception, I think, when a struct inside a union is copied.)

So for a class type, the only way (outside union copies) to make a new instance is via a constructor (even for those with trivial compiler generated constructors).

You can't take the address of an rvalue via unary operator & but that doesn't mean that there is no rvalue object; and an object, by definition, has an address; and that address is even represented by a syntax construct: an object of class type can only be created by a constructor, and it has a this pointer; but for trivial types, there is no user written constructor so there no place to put this until after the copy is constructed, and named.

For scalar type, the value of an object is the rvalue of the object, the pure mathematical value stored into the object.

For a class type, the only notion of a value of the object is another copy of the object, which can only be made by a copy constructor, a real function (although for trivial types that function is so specially trivial that copies can sometimes be created without calling the constructor). That means that the value of an object is the result of a change of global program state by an execution. It is not accessible as a pure mathematical value.

So pass by value really isn't a thing: it's pass by copy constructor call, which is less pretty. The copy constructor is expected to perform a sensible "copy" operation according to the proper semantic of the object type, respecting its internal invariants (which are abstract user properties, not intrinsic C++ properties).

Pass by value of a class object means:

create another instance
then make the called function act on that instance.
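These two steps can be observed directly with a copy constructor that counts its own calls. A minimal sketch (the type Tracked, the counter g_copies and the function names are hypothetical, made up for this illustration):

```cpp
#include <cassert>

int g_copies = 0;  // hypothetical global counter of copy-constructor calls

struct Tracked {
    int value;
    explicit Tracked(int v) : value(v) {}
    Tracked(const Tracked& other) : value(other.value) { ++g_copies; }
};

// Step 2: the callee acts on the new instance, not on the caller's object.
int callee(Tracked t) {
    t.value += 1;  // modifies only the copy
    return t.value;
}

// Returns how many copy-constructor calls a single by-value call caused.
int copies_for_one_call() {
    Tracked original(5);
    int before = g_copies;
    int seen = callee(original);  // step 1: another instance is created here
    assert(seen == 6);
    assert(original.value == 5);  // the caller's object is untouched
    return g_copies - before;
}
```

Passing the named object original by value invokes the copy constructor exactly once, so "pass by value" here really is one constructor call followed by the callee working on the result.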

Note that the issue has nothing to do with whether the copy itself is an object with an address: all function parameters are objects and have an address (at the language semantic level).

The issue is whether:

the copy is a new object initialized with the pure mathematical value (true pure rvalue) of original object, as with scalars;
or the copy is the value of original object, as with classes.

In the case of a trivial class type, you can still define the member by member copy of the original, so you get to define the pure rvalue of the original because of the triviality of the copy operations (copy constructor and assignment). Not so with arbitrary special user functions: a value of the original has to be a constructed copy.

Class objects must be constructed by the caller. A constructor formally has a this pointer, but the formalism isn't relevant here: all objects formally have an address, but only those whose address is actually used in ways that are not purely local (unlike *&i = 1;, which is a purely local use of the address) need to have a well-defined address.

An object must absolutely be passed by address if it must appear to have an address in both of these separately compiled functions:

void something(int *p); // defined in another translation unit

void callee(int &i) {
  something(&i);
}

void caller() {
  int i;
  callee(i);
  something(&i);
}

Here, even if something(address) is a pure function or macro or whatever (like printf("%p",arg)) that can't store the address or communicate it to another entity, we are required to pass by address, because the address must be well defined for a unique int object that has a unique identity.

We don't know if an external function will be "pure" in terms of the addresses passed to it.

Here the potential for a real use of the address in either a non-trivial constructor or destructor on the caller side is probably the reason for taking the safe, simplistic route: giving the object an identity in the caller and passing its address. That makes sure that any non-trivial use of its address in the constructor, after construction, and in the destructor is consistent: this must appear to be the same over the object's whole existence.
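
As a sketch of the kind of type this protects, consider a hypothetical class that records its own address at construction and checks it later. The names `self_aware`, `consistent` and `callee` are invented for this illustration.

```cpp
#include <cassert>

// Hypothetical type whose invariant depends on a stable `this`.
struct self_aware {
    self_aware() : self(this) {}
    // A copy is a new object with its own identity, so it records its own address.
    self_aware(const self_aware&) : self(this) {}
    ~self_aware() { assert(self == this); }  // would fire if the object had been moved bitwise

    bool consistent() const { return self == this; }

    const self_aware* self;  // address seen by the constructor
};

// By-value parameter: constructed once, and must keep that address
// until its destructor has run.
bool callee(self_aware s) { return s.consistent(); }
```

If the parameter were built at one location and then moved bit by bit into a register or another slot, `self` would no longer match `this`. Constructing the object in memory and passing its address rules that out without the compiler having to inspect the special member functions.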

A non-trivial constructor or destructor, like any other function, can use the this pointer in a way that requires its value to stay consistent, even though some objects with non-trivial special functions might not need that:

#include <unistd.h> // dup, close (POSIX)

struct file_handler { // don't use that class!
    int fileno; // owned file descriptor, or -1 if none
    file_handler () { this->fileno = -1; }
    file_handler (int f) { this->fileno = f; }
    file_handler (const file_handler& rhs) {
        if (rhs.fileno != -1) // test the source: this->fileno is not initialized yet
            this->fileno = dup(rhs.fileno);
        else
            this->fileno = -1;
    }
    ~file_handler () {
        if (this->fileno != -1)
            close(this->fileno);
    }
    file_handler &operator= (const file_handler& rhs);
};

Note that in that case, despite the explicit use of a pointer (explicit syntax this->), the object identity is irrelevant: the compiler could well bitwise-copy the object around to move it and to do "copy elision". This is based on the level of "purity" of the use of this in the special member functions (the address doesn't escape).

But purity isn't an attribute available at the standard declaration level (compiler extensions exist that add a purity description to non-inline function declarations), so you can't define an ABI based on the purity of code that may not be available (code may or may not be inline and available for analysis).
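
As one illustration of such an extension, GCC and Clang accept a `pure` function attribute, spelled `[[gnu::pure]]` below. This is a sketch, not standard C++, and the function `peek` is a hypothetical name. The attribute promises that the function has no observable effect other than its return value, which is related to, but not the same as, the "address doesn't escape" notion used in this answer; it also does not change how arguments are passed.

```cpp
#include <cassert>

// Declaration as it might appear in a header: the promise is visible to
// callers even though the body is not. (GCC/Clang extension; other
// compilers are expected to ignore the unknown attribute.)
[[gnu::pure]] int peek(const int* p);

// Definition, possibly in another translation unit: it only reads *p and
// never stores the pointer anywhere.
int peek(const int* p) { return *p; }
```

A caller that sees only the declaration could, for example, merge two calls `peek(&x)` with no intervening writes, which is the kind of reasoning an unannotated external function forbids.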

Purity is measured as "certainly pure" or "impure or unknown". The common ground, or upper bound of semantics (actually maximum), or LCM (Least Common Multiple) is "unknown". So the ABI settles on unknown.

Summary:

Some constructs require the compiler to define the object identity.
The ABI is defined in terms of classes of programs and not specific cases that might be optimized.

Possible future work:

Is purity annotation useful enough to be generalized and standardized?

Your first example appears misleading. I think you're just making a point in general, but at first I thought you were making an *analogy* to the code in the question. But `void foo(unique_ptr ptr)` takes the class object *by value*. That object has a pointer member, but we're talking about the class object itself being passed by reference. (Because it's not trivially-copyable so its constructor/destructor need a consistent `this`.) That's the real argument and not connected to the first example of passing by reference *explicitly*; in that case the pointer is passed in a register. — Peter Cordes, Nov 30 '19 at 01:51
@PeterCordes "_you were making an analogy to the code in the question._" I did exactly that. "_the class object by value_" Yes I probably should explain that **in general there is no such thing as the "value" of a class object** so by value for a non math type isn't "by value". "_That object has a pointer member_" The ptr-like nature of a "smart ptr" is irrelevant; and so is the ptr member of the "smart ptr". A ptr is just a scalar like an `int`: I wrote an "smart fileno" example which illustrates that "ownership" has nothing to do with "carrying a ptr". — curiousguy, Nov 30 '19 at 04:07
"_That's the real argument and not connected to the first example of passing by reference explicitly_" Morally it's almost like passing by reference since the lifetime starts in the caller! — curiousguy, Nov 30 '19 at 04:10
The value of a class object is its object-representation. For `unique_ptr`, this is the same size and layout as `T*` and fits in a register. Trivially-copyable class objects can be passed *by value* in registers in x86-64 System V, like most calling conventions. This makes a *copy* of the `unique_ptr` object, unlike in your `int` example where the callee's `&i` *is* the address of the caller's `i` because you passed by reference *at the C++ level*, not just as an asm implementation detail. — Peter Cordes, Nov 30 '19 at 04:11
@PeterCordes What the hell is the "object-representation" of x of type `class C`? The values of all the `sizeof(C)` bytes in memory? — curiousguy, Nov 30 '19 at 04:13
Yes. Look it up in the ISO C++ standard. (And correct me if I'm misusing the term :P) — Peter Cordes, Nov 30 '19 at 04:15
Let us [continue this discussion in chat](https://chat.stackoverflow.com/rooms/203366/discussion-between-curiousguy-and-peter-cordes). — curiousguy, Nov 30 '19 at 04:15
Err, correction to my last comment. It's not just making a *copy* of the `unique_ptr` object; it's using `std::move` so it is safe to copy it because that won't result in 2 copies of the same `unique_ptr`. But for a trivially-copyable type, yes, it does copy the whole aggregate object. If that's a single member, good calling conventions treat it the same as a scalar of that type. — Peter Cordes, Nov 30 '19 at 04:17
BTW, most of this answer is sensible, it's the confusing example that it starts with that I downvoted for. (After you confirmed that it was on purpose and intended as an analogy for the question). If you fixed that, I'd change my vote. It appears to confuse the *value* of the unique_ptr (in this case the pointer value that it's wrapping) with the identity of a single `unique_ptr` object. By analogy, you can make as many copies as you want of an `int*` and they all still point to the same `int`. The separate argument about constructors / destructors needing a consistent `this` is solid. — Peter Cordes, Nov 30 '19 at 09:06
@PeterCordes I will do a rewrite of the answer, adding explanations of where I'm coming from and getting at. I agree the 1st example lacks context and is too dry. — curiousguy, Dec 08 '19 at 02:10
Looks better. Notes: *For C structs compiled as C++* - This is not a useful way to introduce the difference between C++. In C++ `struct{}` is a C++ struct. Perhaps you should say "plain structs", or "unlike C". Because yes, there's a difference. If you use `atomic_int` as a struct member, C will non-atomically copy it, C++ error on the deleted copy constructor. I forget what C++ does on structs with `volatile` members. C will let you do `struct tmp = volatile_struct;` to copy the whole thing (useful for a SeqLock); C++ won't. — Peter Cordes, Dec 08 '19 at 06:51
Ugh, you still kept that nonsense example that takes an `int &i` reference arg. *The reference itself* doesn't need to have an address, it only needs to *hold* the address of the original object. If you compile it; https://godbolt.org/z/6wkFE4 ; you see that the reference aka pointer is only passed in a register (RDI) in x86-64 System V and `int &i` does not have its own address. Only the original `i` that it's a reference *to* has an address. (Note that unary `&i` operator on a reference gives you the address of the referred-to object, i.e. the pointer that the reference holds.) — Peter Cordes, Dec 08 '19 at 06:59
@PeterCordes "_This is not a useful way to introduce the difference between C++_" More work is needed. This isn't perfect but I was tired... — curiousguy, Dec 08 '19 at 07:24
