Consider the following simplified example:

#include <utility>
#include <memory>

int test_lack()
{
    auto lam = []
    {
        return 10;
    };

    // move lam to the heap
    void* ptr = new decltype(lam)(std::move(lam));

    // retrieve the result of lam
    int res = (*static_cast<decltype(lam)*>(ptr))();

    if (ptr) // important
        delete static_cast<decltype(lam)*>(ptr);

    return res;
}

GCC 5.2 -O3 compiles it into:

test_lack():
  sub   rsp, 8
  mov   edi, 1
  call  operator new(unsigned long)
  mov   esi, 1
  mov   rdi, rax
  call  operator delete(void*, unsigned long)
  mov   eax, 10
  add   rsp, 8
  ret

Clang 3.7 -O3 optimizes this into:

test_lack():
  mov   eax, 10
  ret

Why is Clang capable of optimizing the given code whereas GCC fails?

Are there any compiler flags or code changes that would allow GCC to optimize the given code more aggressively?


Just a short addition to the answer:

GCC 5.2 doesn't implement standard proposal N3664 ("Clarifying Memory Allocation"), which means it can't optimize away the call to operator new the way Clang does.
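To illustrate what the proposal covers, here is a minimal sketch (not taken from the original code; the function names via_new_expression and via_direct_call are made up for illustration). As far as the C++14 wording goes, the permission to omit a call to a replaceable global allocation function applies to allocations made through a new-expression, while a direct call to ::operator new is an ordinary function call. Whether a particular compiler version actually removes either allocation is something to check in its generated assembly.

```cpp
#include <new>

// Allocation through a new-expression. Under the C++14 wording, an
// implementation is allowed to omit the underlying call to the
// replaceable global operator new here.
int via_new_expression()
{
    int* p = new int(10);
    int res = *p;
    delete p;
    return res;
}

// Allocation through a direct call to the allocation function.
// This is a plain function call, so the allowance for new-expressions
// does not cover it (assumption: no compiler-specific builtins involved).
int via_direct_call()
{
    void* mem = ::operator new(sizeof(int));
    int* p = ::new (mem) int(10); // construct the int in the raw storage
    int res = *p;
    ::operator delete(mem);
    return res;
}
```

Both functions return the same value; the difference is only in how much freedom the compiler has to drop the heap allocation.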

Naios
  • After testing, it appears only Clang 3.7 does this. Clang 3.6 produces the same assembly as GCC 5.2. – NathanOliver Sep 24 '15 at 12:37
  • @sabbi Good catch there. I allocated the lambda on the stack again and GCC optimizes it as expected: https://goo.gl/pmQ0T6 . – Naios Sep 24 '15 at 12:49
  • I think the null test before delete is unnecessary. – Peter - Reinstate Monica Sep 24 '15 at 12:58
  • I took another look at my original code, and it seems GCC optimizes it when std::malloc and std::free are used to allocate the memory without a null check. In other words, GCC fails to optimize the correct code but does optimize the version that could cause segfaults: https://goo.gl/o8v9g3 – Naios Sep 24 '15 at 13:08
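The two variants mentioned in the comments might look roughly like the sketch below. The linked code is not reproduced here: the names test_stack and test_malloc and the -1 error return are assumptions for illustration. The std::malloc version keeps the null check so that it stays well-defined when allocation fails; no claim is made about what assembly a given GCC version produces for it.

```cpp
#include <cstdlib>
#include <new>
#include <utility>

// Variant 1: keep the lambda on the stack, no heap allocation at all.
int test_stack()
{
    auto lam = [] { return 10; };
    auto moved = std::move(lam); // move into another automatic object
    return moved();
}

// Variant 2: allocate raw storage with std::malloc and construct the
// lambda in it with placement new.
int test_malloc()
{
    auto lam = [] { return 10; };
    using Lam = decltype(lam);

    void* mem = std::malloc(sizeof(Lam));
    if (!mem)       // std::malloc returns a null pointer on failure
        return -1;  // hypothetical error value for this sketch

    Lam* p = ::new (mem) Lam(std::move(lam));
    int res = (*p)();

    p->~Lam();      // end the object's lifetime before freeing the storage
    std::free(mem);
    return res;
}
```

Comparing the output of both functions at -O3 against the original test_lack() is one way to see which allocation patterns a given compiler is able to remove.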
