I am having a problem with some C++20 code that crashes when compiled in release mode (-O3, clang 15). It is very tricky to debug because a number of obfuscation techniques are applied to the final executable, which makes it extremely difficult to see the actual x86_64 assembly being executed.

Still, I have managed to narrow the crash down to the following pseudo code...

for (std::wstring& data : datas)
{
    auto id = std::format(L"foo_{0}", data);

    do_something(id, std::move(data));
}

...where the signature for do_something looks like this:

void do_something(const std::wstring&, std::wstring);

I assume the move constructor of the std::wstring for the second argument is invoked while the call is being set up, that is, before the body of do_something starts executing.
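To make that assumption concrete, here is a minimal sketch that logs the order of events. `Tracked`, `make_id`, `run_once` and the `events` log are hypothetical names invented for this illustration, and `make_id` uses plain concatenation as a stand-in for `std::format` so the sketch does not depend on `<format>` support.

```cpp
#include <string>
#include <utility>
#include <vector>

// Hypothetical event log, only here to make the ordering visible.
std::vector<std::string> events;

// Hypothetical wrapper around std::wstring whose move constructor logs itself.
struct Tracked {
    std::wstring value;
    explicit Tracked(std::wstring v) : value(std::move(v)) {}
    Tracked(Tracked&& other) : value(std::move(other.value)) {
        events.push_back("move-construct parameter");
    }
};

// Stand-in for std::format(L"foo_{0}", data): reads data, does not modify it.
std::wstring make_id(const Tracked& data) {
    events.push_back("build id");
    return L"foo_" + data.value;
}

// Same shape as the signature in the question: const reference, then by value.
void do_something(const std::wstring&, Tracked) {
    events.push_back("enter do_something");
}

// One loop iteration from the question, written as two separate statements.
void run_once(Tracked& data) {
    auto id = make_id(data);            // complete before the next statement starts
    do_something(id, std::move(data));  // parameter is move-constructed during call setup
}
```

The by-value parameter is initialized from the xvalue produced by `std::move(data)`, so the logged move happens as part of the call, after the previous statement has finished and before the function body runs.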

Now, this code works fine with -O2, but when enabling -O3 it breaks, and I have a hunch.

Reading up on order of evaluation in C++ (the wording below is from cppreference rather than the standard itself), we can see the following...

> Order of evaluation of any part of any expression, including order of evaluation of function arguments is unspecified (with some exceptions listed below). The compiler can evaluate operands and other subexpressions in any order, and may choose another order when the same expression is evaluated again.

I have also read that C++ compilers are free to optimize away locals when they have no side effects (but I have not located this specific rule in the standard).

Could it be that, since std::format in itself has no side effects in the above code, the compiler "inlines" it into the function call when using -O3? And if it does, the order of evaluation would be unspecified, so the move could actually happen before std::format is called?

Basically what I am asking is: would the optimization I describe be permitted by the C++ standard?
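For contrast, here is a minimal sketch of the two shapes being compared. It assumes a hypothetical `do_something` that simply remembers its arguments, and it uses string concatenation as a stand-in for `std::format`. `two_statements` mirrors the loop body above; `one_expression` is the variant where both uses of `data` sit in the same call, which is the only place the unspecified argument order applies.

```cpp
#include <string>
#include <utility>

// Hypothetical stand-in for do_something: remembers what it was given.
std::wstring last_id;
std::wstring last_payload;

void do_something(const std::wstring& id, std::wstring data) {
    last_id = id;
    last_payload = std::move(data);
}

// Two statements, as in the question: id is fully built before the call begins.
void two_statements(std::wstring& data) {
    std::wstring id = L"foo_" + data;  // stand-in for std::format(L"foo_{0}", data)
    do_something(id, std::move(data));
}

// One expression: building the first argument and move-constructing the second
// parameter are not ordered relative to each other, so the id may or may not
// see the original contents of data.
void one_expression(std::wstring& data) {
    do_something(L"foo_" + data, std::move(data));
}
```

In `two_statements` the id always reflects the original contents of `data`. In `one_expression` the payload is still the original string, but whether the id was built before or after the move depends on the order the compiler happens to pick.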

Znurre
  • Since when has parameter evaluation order ever been guaranteed? – πάντα ῥεῖ Aug 06 '23 at 10:05
  • @πάνταῥεῖ It's not necessarily only about the parameter evaluation order (as mentioned, I am aware that it's unspecified), but rather whether the compiler is allowed to do the full optimization that I am describing, which would make the non-deterministic evaluation order a problem. – Znurre Aug 06 '23 at 10:07
  • No optimization can change behaviour of well-defined code, including making it undefined - think about what you imply "making well-defined code undefined via optimization", that's not optimization at all. – Quimby Aug 06 '23 at 10:08
  • What compiler options have you used exactly? – tevemadar Aug 06 '23 at 10:24
  • @tevemadar It's a big code project, so I cannot easily share all options, but the only explicit optimizations enabled are -O3 (and ThinLTO, but these two functions are in the same translation unit). – Znurre Aug 06 '23 at 10:34
  • I had a bug in some code that occurred with `-O3` but not with `-O2`, using GCC `g++`. For the O3 [optimizations options](https://gcc.gnu.org/onlinedocs/gcc/Optimize-Options.html), I disabled them one-by-one until I found the one that was causing the issue, and that helped me track down the naughty code. Note: not all optimizations *necessarily* have a specific optimization option flag. – Eljay Aug 06 '23 at 11:39
  • Notice that your data in `datas` would be moved... Can the error occur afterwards? – Jarod42 Aug 06 '23 at 12:46
  • [OT]: *"data"* is already plural, *"datas"* is incorrect. `string`/`strings` would be better names. – Jarod42 Aug 06 '23 at 12:48
  • @Jarod42 This is just pseudo code describing the problem. In the real code, `datas` is never used after its elements have been moved. The purpose of this question was not necessarily to find a solution to the problem per se, more so for me to get a confirmation that my understanding of C++ is correct, as I started questioning myself when I found out that something about this code was causing the crash. – Znurre Aug 06 '23 at 12:57
  • Yeah, I think you are barking up a wrong tree here. C++ may be weird but it isn't *that* weird. Basic principles still apply like "successive statements are sequenced in program order". A statement sees all the side effects of previous statements, and none of the side effects of later statements. – Nate Eldredge Aug 13 '23 at 14:00

1 Answer

> Could it be that, since std::format in itself has no side effect in the above code, the compiler "inlines" it into the function call when using -O3?

No. Or, if it does, it will keep the order of operations such that std::format is called before the move. Here "the move" means the move construction of the second parameter, not std::move itself, which is only a cast and generates no instructions. Otherwise, well-defined behavior would turn into undefined behavior, and that is not something a compiler is allowed to do in normal situations.
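A small sketch of the distinction between std::move and the move constructor, using hypothetical helper names (`only_binds`, `size_after_cast_only`) that are not part of the question's code. It shows that passing `std::move(s)` to a function that merely binds an rvalue reference leaves `s` untouched, because no move constructor ever runs.

```cpp
#include <cstddef>
#include <string>
#include <utility>

// Binds an rvalue reference but never constructs or assigns from it,
// so the argument is not moved from.
void only_binds(std::wstring&&) {}

// Returns the size of the string after it has been passed through std::move
// to a function that does nothing with it.
std::size_t size_after_cast_only() {
    std::wstring s = L"payload";
    only_binds(std::move(s));  // std::move is just a cast to std::wstring&&
    return s.size();           // s still holds "payload"
}
```

The contents of `data` in the question only change when the by-value parameter of do_something is move-constructed, which is part of the call itself and therefore comes after the preceding statement that builds `id`.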

A compiler can provide options that break standard compliance. This is the case with -Ofast for example. However, -O3 does not break standard compliance, and as far as I know, there is no GCC option that would result in what you describe.

Nelfeal