Are compilers optimizing out all temporary objects?

Question

Some people say compilers are smarter than humans in most cases and will optimize far more than we can by hand. I want to know whether the compiler is optimizing this out. The code below gives an interesting result with no errors, but something looks seriously wrong.

//will this move or copy construct?
#include <iostream>

class A
{
public:

    A()
    {
        std::cout << "Constructed A.\n";
    }

    ~A()
    {
        std::cout << "A destroyed.\n";
    }

};

class B
{
private:
    A m_A;

public:
    B(A someA):m_A{someA}
    {

    }
};

int main()
{
    B oneB{A()};

    return 0;
}

This code prints the following on Windows 10 using Clang++ 13.0.1 with -fexceptions -O3 -Wall -g -std=c++20 -v -c:

Constructed A.
A destroyed.
A destroyed.

Why is A destroyed twice but constructed only once? This happens even when B's constructor takes its argument by constant reference. This is getting ridiculous. I'm still learning C++ and have never worked on any project. I'm asking about something that confused me while learning online.

`Why is A destroyed twice but constructed only once?` It is not constructed only once, you're just not getting any output from move nor copy constructor — tkausl, May 24 '22 at 09:09
Because you did not declare any copy or move constructor for A, the compiler adds one for you. And it does not print. — user253751, May 24 '22 at 09:10
the optimizer is subject to the as-if-rule which says the optimized program must have same observable behavior. See https://stackoverflow.com/questions/15718262/what-exactly-is-the-as-if-rule There are things like copy-elision that can change the observable behavior though. — 463035818_is_not_an_ai, May 24 '22 at 09:13

Jason · Answer 1 · 2022-05-24T09:50:24.447

Why is A destroyed twice but constructed only once?

The A that is destroyed the second time is a different object from the A destroyed the first time. You can verify this by adding a copy constructor to your class A as shown below. That copy constructor is what initializes m_A in the m_A{someA} of the member initializer list.

class A
{
public:

    //other members as before 

    //copy constructor added 
    A(const A&)
    {
        std::cout<<"copy ctor"<<std::endl;
    }

};

Demo

After adding the copy constructor the output of the program will look like:

Constructed A.
copy ctor
A destroyed.
A destroyed.

Note that you're using C++20, which has mandatory copy elision (from C++17 onwards). This means that when you wrote:

B oneB{A()};

in C++17 and C++20 no temporary object is created, and the parameter A someA of B::B(A) is initialized directly, without copying any temporary.

But prior to C++17, copy elision was non-mandatory. This means that a temporary A object is created and then copied/moved into the parameter named someA, although compilers were allowed to elide this copy/move construction as an optimization.

To verify this, you can pass the -fno-elide-constructors flag to the compiler in C++11 or C++14 (which tells the compiler not to perform the optimization involving copy/move construction), and you will see that this is indeed what happens, as shown in the given demo link:

Demo C++11 with fno-elide-constructors.

The output of the program with C++11 and -fno-elide-constructors flags will be:

Constructed A.
copy ctor
copy ctor
A destroyed.
A destroyed.
A destroyed.

Note that the flag -fno-elide-constructors only affects the output of your program when you compile with a pre-C++17 version of the standard, such as C++11 or C++14. From C++17 onwards, there is no extra call to the copy constructor even with the flag. Demo C++17

One question sir, does that mean compiler will not optimize any object without a explicit move constructor? — Blake, May 24 '22 at 09:25
Actually since you're using `C++20` so here(in your given sample) there is no optimization. But if you were to use `C++11` or `C++14` then you can even see the optimization empirically for confirming that when you wrote `B oneB{A()};` a temporary `A` object is created and its copy is passed to `B::B(A)` ctor. For verifying this see [this](https://wandbox.org/permlink/qT34Tq5DxkvC34iE) demo that i created for C++11 using the `-fno-elide-constructors` flag. The thing to note here is that the copy ctor will be called 2 times as i said. So there will in total 3 `A` construction and destruction. — Jason, May 24 '22 at 09:32
@Blake I have added some more information regarding optimization in my answer. Check out my updated answer. — Jason, May 24 '22 at 09:44

score 5 · Answer 2 · answered May 24 '22 at 09:10

You are wrong: it is not constructed once but twice as well. The first time as the constructor argument (a temporary before C++17), the second time when it is copied into the B object.

However, the copy constructor used for that is generated implicitly and doesn't produce any output. Add one explicitly and you'll see:

A(A const&)
{
    std::cout << "Copied A.\n";
}

Aconcagua


I see, so the compiler will not optimize any objects that don't have a move constructor? – Blake May 24 '22 at 09:20
It won't do so on a move constructor either! Moving involves two objects just as copying does as well. And moving must be implemented appropriately as well, unless you have only primitive types involved; then, and in a few other cases, moving is equivalent to copying. Moving is useful if you have two objects x and y carrying dynamically allocated data (like strings) and you can say: Hey, I don't care for x's state after moving any more, all I want is y holding the data of x afterwards – so if strings, x might remain empty while y then holds the data x formerly did. – Aconcagua May 24 '22 at 09:26
@Blake Because the constructor and destructor have side effects (printing), they cannot be optimized out. To see the effect of optimization, you should look at the generated assembly. – VLL May 24 '22 at 09:55
@Blake: Also note: because you declared a destructor for `A` yourself, the compiler no longer generates the implicit move constructor and move assignment operator (there's still an implicitly-generated copy constructor and copy assignment operator, but even that is deprecated since C++11). You could re-enable the compiler-generated move constructor/assignment by declaring `A(A&&) = default;` and `A& operator=(A&&) = default;` to allow the compiler to use moves when copy elision isn't available. – ShadowRanger Aug 30 '22 at 14:24

score 0 · Answer 3 · answered May 24 '22 at 13:04

The compiler isn't allowed to change observable behaviour except in the case of copy-elision. See other answers for details on how the ISO C++ standard allows changes to visible behaviour that way, and what rules govern implicit definition and use of constructors your code didn't define.

You need to look at the asm to see if it optimized out an object; in your case it will have inlined the constructor and just called the operator<< overload for std::cout to maintain the visible side-effects, while not actually reserving any storage space on the stack for your objects of type A or B. sizeof(A) == 1 but it's only padding, just because each object needs to have its own identity (and its own address). Even if there had been an int member = 42; member in either A or B, since nothing ever reads it, only an un-optimized debug build would actually reserve stack space for it and store a 42.

In C++, every object has an address, including an int. (In C++ and C terminology, even a primitive type is an object.) C forbids taking the address of a register variable; in C++ the keyword was only ever a hint, and it was deprecated in C++11 and removed in C++17. The question "Why does clang produce inefficient asm with -O0 (for this simple floating point sum)?" shows an example of optimized versus unoptimized code.

We can see this just as well with local int variables, no need to mess around with constructors.

int foo(int a){
    int b = a+1;
    int c = b*4;
    int d = c >> 7;
    return d;
}

In a debug build, every int object in the C++ abstract machine gets its own address, and is actually stored to and reloaded from stack space. Nobody wants that except for debugging (or if you do, use volatile for that one object), so an optimized build for x86-64 looks like this on the Godbolt compiler explorer:

# GCC11.3 -O3
foo(int):                  # first arg in EDI per System V calling convention
        lea     eax, [4+rdi*4]
        sar     eax, 7
        ret                # return value in EAX

By contrast, here is GCC output with the default -O0, plus -fverbose-asm to comment it with the names of C++ objects. I've added comments to the right of those. Intel syntax is operation dst, src. Square brackets are an addressing mode, dereferencing a pointer in a register. (In this case just the frame pointer, to access space in the stack frame of the current function.)

foo(int):              # demangled asm symbol name
# prologue setting up RBP as a frame pointer
        push    rbp     #
        mov     rbp, rsp  #,
  # It doesn't need to sub rsp, 24  because x86-64 SysV has a red zone below the stack pointer
        mov     DWORD PTR [rbp-20], edi   # a, a       # spill incoming register arg

        mov     eax, DWORD PTR [rbp-20]   # tmp87, a   # reload it
        add     eax, 1    # tmp86,
        mov     DWORD PTR [rbp-4], eax    # b, tmp86   # int b = a+1;

        mov     eax, DWORD PTR [rbp-4]    # tmp91, b
        sal     eax, 2    # tmp90,
        mov     DWORD PTR [rbp-8], eax    # c, tmp90   # int c = b*4;

        mov     eax, DWORD PTR [rbp-8]    # tmp95, c
        sar     eax, 7    # tmp94,
        mov     DWORD PTR [rbp-12], eax   # d, tmp94   # int d = c >> 7;

        mov     eax, DWORD PTR [rbp-12]   # _5, d      # return d;

        pop     rbp       #                            # epilogue
        ret
