I am watching an old talk by Jason Turner, "Practical Performance Practices".

Right at the beginning there's an example of code that GCC 5.1 optimizes away completely:

#include <string>

int main() {
    return std::string("a").size();
}

Which compiles to "nothing":

main:
        mov     eax, 1
        ret

However, I was quite surprised to see a different output with GCC 13.2:

main:
        sub     rsp, 40
        lea     rax, [rsp+16]
        mov     rdi, rsp
        mov     QWORD PTR [rsp+8], 1
        mov     QWORD PTR [rsp], rax
        mov     eax, 97
        mov     WORD PTR [rsp+16], ax
        call    std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >::_M_dispose()
        mov     eax, 1
        add     rsp, 40
        ret

https://godbolt.org/z/45PWox4Gb

Is it a bug that GCC hasn't been able to constant-propagate std::string::size() since 5.1 (since 12.3, according to the comments)? At -O3 it used to do this without any -std= argument. The new version requires -std=c++20 to produce the same output.
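
To illustrate the -std=c++20 observation, here is a minimal sketch. It assumes the difference comes from std::string being usable in constant expressions in C++20, which is a guess rather than something confirmed for GCC's optimizer. The helper name size_of_a is hypothetical and not part of the original example. When the library reports full constexpr string support, the static_assert requires the compiler to compute the size at compile time. Otherwise the sketch falls back to an ordinary runtime function.

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper (not from the talk): the same expression as in main above.
#if defined(__cpp_lib_constexpr_string) && __cpp_lib_constexpr_string >= 201907L
// C++20 library with constexpr std::string: evaluation can happen at compile time.
constexpr std::size_t size_of_a() {
    return std::string("a").size();
}
// This only compiles if the expression is a valid constant expression equal to 1.
static_assert(size_of_a() == 1, "size of \"a\" should be 1");
#else
// Older standard or library: ordinary runtime function; folding is up to the optimizer.
inline std::size_t size_of_a() {
    return std::string("a").size();
}
#endif
```

Calling size_of_a() from main and comparing the assembly with and without -std=c++20 is one way to check whether the guess holds for a given compiler version.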

Sergey Kolesnik
  • Works again with `-std=c++20`, presumably because `std::string` became constexpr. – Quimby Aug 02 '23 at 09:29
  • It seems the "break" happened in 12.3. It was still optimized in 12.2. – Ted Lyngmo Aug 02 '23 at 09:31
  • @Quimby well, then the question is whether this was changed on purpose, and if so, for what purpose. – Sergey Kolesnik Aug 02 '23 at 09:31
  • @SergeyKolesnik I think it works with C++20 because it is handled by "constexpr optimizations" now while before it had to rely on the generic optimizer. Anyway, I think you should file a regression report for this. – Quimby Aug 02 '23 at 09:32
  • Maybe `std::string` implementation became more difficult to optimize out? – pptaszni Aug 02 '23 at 09:33
  • Your question title could be a *lot* more specific, like "is it a bug that GCC hasn't been able to constant-propagate std::string::size() since 5.1?" Obviously the optimizer isn't broken in general, but obviously every version has many different missed optimizations, some regressions, some not. – Peter Cordes Aug 02 '23 at 09:35
  • @PeterCordes I have updated the wording – Sergey Kolesnik Aug 02 '23 at 09:38
  • Interesting idea, @pptaszni: Clang even at `-O1` is able to constant-propagate through the same libstdc++ implementation of `std::string`. (It needs `-O2` to constprop through libc++'s `std::string`.) Probably what changed is how GCC treats `__attribute__((cold))`, which `main` has implicitly but other function names don't, because GCC has no trouble when you put this code in a differently-named function: https://godbolt.org/z/4vE5oMEbe – Peter Cordes Aug 02 '23 at 09:39
  • @SergeyKolesnik: Your edit didn't change the title, though! Anyway, it turns out the answer really is pretty general: `main` is optimized less than other functions, so a more specific title isn't really needed. – Peter Cordes Aug 02 '23 at 09:44
  • @PeterCordes I read the duplicate but couldn't really figure out _why_ `main` is optimized less just because it's only going to be called once. It'll compile faster, but unless building a package containing a huge amount of `main`s, I don't think it'll be noticeable, so is compilation speed really the main reason? – Ted Lyngmo Aug 02 '23 at 10:13
  • @TedLyngmo: No, code-size. It's optimized on the assumption that code in it will run with I-caches and iTLB cold. Both at startup, and when whatever function eventually returns to `main` to exit after running for a long time. – Peter Cordes Aug 02 '23 at 10:21
  • So the idea is that since `main` is `cold`, we should optimize for size, and therefore we don't attempt to inline as aggressively? But ironically, inlining `_M_dispose` would make the code smaller, since `_M_dispose` optimizes to nothing and this would allow us to avoid the several instructions that construct the `std::string`. – Nate Eldredge Aug 02 '23 at 15:35
  • @NateEldredge: Yes, apparently GCC's heuristic made a sub-optimal choice in this case, not looking at how much would optimize away after trying inlining. I wonder how much this ever helps for real-world code, and whether it might be time for GCC to reconsider that default of making `main` implicitly `cold`. – Peter Cordes Aug 03 '23 at 02:13
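
The comments above suggest a way to test the `cold` explanation: move the expression out of `main` into ordinary functions. A minimal sketch follows, with hypothetical function names that are not taken from the linked Godbolt example. The first function is a plain helper, which per the comments GCC optimizes without trouble. The second carries an explicit `__attribute__((cold))` (a GCC/Clang extension) to mimic how `main` is reportedly treated. Whether the explicit attribute reproduces exactly the same code generation as `main` is an assumption that has not been verified here.

```cpp
#include <cstddef>
#include <string>

// Hypothetical name: an ordinary, non-main function with the same body as main.
std::size_t size_in_helper() {
    return std::string("a").size();
}

// Same body, but explicitly marked cold on compilers that support the attribute.
// Assumption: this approximates the implicit treatment of main described above.
#if defined(__GNUC__)
__attribute__((cold))
#endif
std::size_t size_in_cold_helper() {
    return std::string("a").size();
}
```

Both functions return 1 at run time. The interesting part is the assembly at -O3: if the `cold` explanation is right, the first should reduce to a constant while the second may keep the string construction and the `_M_dispose` call.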
