Consider the following code:

#include <algorithm>
#include <numeric>

int main() {
    int* v = new int[1000];
    std::fill(v,v+1000,0);
    std::iota(v,v+1000,0);
    int res = v[999];
    delete[] v;
    return res;
}

I would expect the compiler to see that the std::fill is not needed because the array is immediately overwritten by std::iota. However, with either Clang or GCC (with -O3), the generated assembly differs if I remove the std::fill call: it is shorter, and as far as I understand, that is because the longer version still contains instructions emitted for std::fill. See here and here.

From a correctness point of view, is something really preventing compilers from optimizing it away? Or is it a missed optimization, and if so, why is it so complicated to figure out?

I stumbled across the issue because if one wants to create a vector with a given size and then assign values to it, the constructor taking a size will zero out the elements even if that is not needed; see this question.
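To make the vector case concrete, here is a minimal sketch. The function names `last_with_vector` and `last_with_raw_array` are hypothetical and only for illustration, and both assume `n > 0`. The first version pays for a zeroing pass that `std::iota` immediately overwrites; the second leaves the storage uninitialized until `std::iota` writes it, which is the same situation as the `new int[1000]` snippet above without the `std::fill`.

```cpp
#include <cstddef>
#include <memory>
#include <numeric>
#include <vector>

// The sized constructor value-initializes (zeroes) all n ints,
// and std::iota then overwrites every one of them.
int last_with_vector(std::size_t n) {
    std::vector<int> v(n);
    std::iota(v.begin(), v.end(), 0);
    return v[n - 1];
}

// new int[n] default-initializes, so the ints hold indeterminate values
// until std::iota writes them; no zeroing pass is requested.
// In C++20, std::make_unique_for_overwrite<int[]>(n) creates the same kind of array.
int last_with_raw_array(std::size_t n) {
    std::unique_ptr<int[]> v(new int[n]);
    std::iota(v.get(), v.get() + n, 0);
    return v[n - 1];
}
```

Both functions return the same value for the same `n`; whether the compiler removes the redundant zeroing in the first one is exactly what is in question here.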

Jorengarenar
Bérenger
  • Missing opportunity I would say. – Jarod42 Sep 08 '21 at 13:07
  • Short answer: it's hard to optimize a code like that. For XY problem, if zero-ing a vector actually is a performance issue for you, you might want to use std::make_unique_for_overwrite from C++20: https://en.cppreference.com/w/cpp/memory/unique_ptr/make_unique – Kaznov Sep 08 '21 at 13:08
  • If you look at the gcc optimization options: https://gcc.gnu.org/onlinedocs/gcc/Optimize-Options.html you'll see that there's a lot of `max-...` values. a 1000-long loop is pretty big, so there's probably one or a few of these that needs to be bumped up for this to be optimized. –  Sep 08 '21 at 13:08
  • FWIW, the threshold appears to be exactly 100: https://godbolt.org/z/zar8M6Eq5 –  Sep 08 '21 at 13:18
  • @Frank Unfortunately it's not that simple... https://godbolt.org/z/7vqdboT3o I honestly don't understand what is the optimizer doing here. Another thing, it depends, if the returned value is one of the last elements of the array or not. – Kaznov Sep 08 '21 at 13:22
  • @Frank Interesting. At 101, it does everything at compile time and correctly returns 100 with no computation ??! – Bérenger Sep 08 '21 at 13:24
  • I guess 100 is okay, 101 is really good, but do not use 1000 with compilers -_- – Bérenger Sep 08 '21 at 13:25
  • Hmm.... Also note that the "Compiler" is not an AI driven program. It **can** optimize things, but it **must** not do! That's where a good programmer acts! – Const Sep 08 '21 at 13:44
  • If you want this optimized, then get some people together and do a code inspection or walk through. :-) – Thomas Matthews Sep 08 '21 at 14:26
  • @Const even if it means that I can't use `std::vector`? cf. https://stackoverflow.com/questions/96579/stl-vectors-with-uninitialized-storage – Bérenger Sep 08 '21 at 14:57

0 Answers