After some experiments, I'm wondering: why can GCC perform dead-code elimination (DCE) on an unused malloc or new buffer, but not on an unused vector?

malloc case: https://godbolt.org/z/xKx5Y1

#include <cstdlib>

void fun() {
    int *x = (int *)std::malloc(sizeof(int) * 100);
}

Resulting assembly:

fun():
        ret

new case: https://godbolt.org/z/66drKr

void fun() {
    int *x = new int[100];
}

Resulting assembly:

fun():
        ret

vector case: https://godbolt.org/z/TWhE1E

#include <vector>

void fun() {
    std::vector<int> x(100);
}

Resulting assembly:

fun():
        sub     rsp, 8
        mov     edi, 400
        call    operator new(unsigned long)
        mov     esi, 400
        lea     rdi, [rax+8]
        mov     rcx, rax
        mov     QWORD PTR [rax], 0
        mov     r8, rax
        mov     QWORD PTR [rax+392], 0
        and     rdi, -8
        xor     eax, eax
        sub     rcx, rdi
        add     ecx, 400
        shr     ecx, 3
        rep stosq
        mov     rdi, r8
        add     rsp, 8
        jmp     operator delete(void*, unsigned long)
Willy
  • Looks like the constructor and/or destructor side effects are stopping the optimizer from no-op'ing out the entire thing like it does in the other two cases. It probably could be optimized out, but that opportunity may have been deemed not interesting enough to spend time on. – Eljay Jan 22 '21 at 02:41
  • But what's the side effect for `vector`? If it's not too complex, why is it considered too time-consuming to check whether this optimization can happen? – Willy Jan 22 '21 at 02:49
  • The side effect is calling the constructor and destructor for each of the 100 `int` objects (which themselves are no-ops), and the memory allocation (and possible `throw`) and deallocation. – Eljay Jan 22 '21 at 03:11
  • `operator new`/`operator delete` might be replaced by custom version with potential side effects. – Jarod42 Jan 22 '21 at 03:13
  • Standard containers use a separate class (referred to as an allocator) that manages memory allocation and deallocation. So the logic of how vector construction and destruction work is a bit more complicated than a simple `new`/`delete` pair. Although the net effect is the same, the analysis a compiler needs (examining the code and derived data structures) to identify that is more complicated. If the compiler vendor doesn't consider that case relevant enough, or worth doing, then the design and coding effort to implement the analysis will not be spent. – Peter Jan 22 '21 at 05:17

2 Answers

Since C++14, from the Allocation section of cppreference's page on new-expressions:

New-expressions are allowed to elide or combine allocations made through replaceable allocation functions. In case of elision, the storage may be provided by the compiler without making the call to an allocation function (this also permits optimizing out unused new-expression). [..]

Note that this optimization is only permitted when new-expressions are used, not any other methods to call a replaceable allocation function: delete[] new int[10]; can be optimized out, but operator delete(operator new(10)); cannot.

The default allocator used by std::vector uses the latter form, so this elision rule does not cover your suggested optimization. The as-if rule could still permit it, but because those operators might have been replaced, the compiler would first have to prove that removing the calls has no observable side effects, which is much harder.
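To make the "might have been replaced" point concrete, here is a minimal sketch of a program that replaces the global allocation function with one that has an observable side effect. The counter `g_new_calls` and the helper `calls_made_by_direct_call` are hypothetical names introduced only for this illustration; they are not from the question or from any library.

```cpp
#include <cstddef>
#include <cstdlib>
#include <new>

// Hypothetical counter: an observable side effect of every allocation.
static std::size_t g_new_calls = 0;

// Program-wide replacement for the global allocation function.
void* operator new(std::size_t size) {
    ++g_new_calls;
    if (void* p = std::malloc(size == 0 ? 1 : size)) {
        return p;
    }
    throw std::bad_alloc{};
}

// Matching replacements for the deallocation functions.
void operator delete(void* p) noexcept { std::free(p); }
void operator delete(void* p, std::size_t) noexcept { std::free(p); }

// Direct call to the replaceable functions, as std::allocator does.
// This is not a new-expression, so the C++14 elision rule does not cover it.
std::size_t calls_made_by_direct_call() {
    const std::size_t before = g_new_calls;
    void* volatile p = operator new(10);  // volatile keeps the pointer in use
    operator delete(p);
    return g_new_calls - before;
}
```

If a compiler dropped the direct call here, the counter would end up with a different value, so it can only do so after seeing the replacement's definition and proving the difference is unobservable.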

If you provide a custom allocator, you might get the expected optimization: Demo.
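As a rough sketch of what such a custom allocator could look like (this is not necessarily the code behind the linked demo), the version below obtains its storage through new-expressions instead of direct `operator new` calls. The names `NewExprAllocator` and `count_zero_elements` are made up for this example, and it assumes the element type needs no more than the default `new` alignment. Whether a particular compiler version then removes the unused vector is not guaranteed; check the generated assembly.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical allocator: storage comes from new-expressions, which the
// compiler is allowed to elide, rather than from direct operator new calls.
template <typename T>
struct NewExprAllocator {
    using value_type = T;

    NewExprAllocator() = default;
    template <typename U>
    NewExprAllocator(const NewExprAllocator<U>&) noexcept {}

    T* allocate(std::size_t n) {
        // New-expression on a byte array; assumes default alignment suffices.
        return reinterpret_cast<T*>(new unsigned char[n * sizeof(T)]);
    }

    void deallocate(T* p, std::size_t) noexcept {
        delete[] reinterpret_cast<unsigned char*>(p);
    }
};

// All instances are interchangeable, so they always compare equal.
template <typename T, typename U>
bool operator==(const NewExprAllocator<T>&, const NewExprAllocator<U>&) noexcept {
    return true;
}
template <typename T, typename U>
bool operator!=(const NewExprAllocator<T>&, const NewExprAllocator<U>&) noexcept {
    return false;
}

// Same shape as the vector case in the question, with the custom allocator.
void fun() {
    std::vector<int, NewExprAllocator<int>> x(100);
}

// Variant that actually uses the vector, to check the allocator works.
std::size_t count_zero_elements() {
    std::vector<int, NewExprAllocator<int>> x(100);
    std::size_t zeroes = 0;
    for (int v : x) {
        if (v == 0) ++zeroes;
    }
    return zeroes;
}
```

Compiling `fun()` with optimizations enabled and comparing its assembly against the `std::allocator` version shows whether the allocation was elided on your toolchain.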

Jarod42
  • It seems that GCC also optimizes out `operator delete(operator new(10));`, but Clang does not. https://godbolt.org/z/8G3PYe https://godbolt.org/z/d7xPs8 – Willy Jan 22 '21 at 03:00
  • The answer is fantastic. But I'm wondering why GCC can optimize pure `operator new` but not `operator new` in the allocator. Haha – Willy Jan 22 '21 at 10:15
  • @Willy: one is an explicitly permitted transformation, even if some side effects might be removed (as with (N)RVO). For the other, the compiler has to prove there are no side effects. So for `operator new`, LTO (link-time optimization) is required in general. – Jarod42 Jan 22 '21 at 11:05

Well, simply because vector is a class. In other words, a possibly sophisticated software object. When you create an instance of it, its constructor must be called. Lots of things happen that the application programmer purposely does not wish to think about, but the object code to do them must be generated nonetheless.

Mike Robinson