Is there any study or set of benchmarks showing the performance degradation due to specifying -fno-strict-aliasing in GCC (or equivalent in other compilers)?
-
Exact duplicate: http://stackoverflow.com/questions/754929/strict-aliasing – GManNickG Aug 04 '09 at 05:39
-
I found no performance numbers on that discussion. I am looking for some test results/data. Did I miss something? – Carlos Aug 04 '09 at 15:11
-
FWIW, there are no performance numbers in the accepted answer either. – peterchen Jul 21 '16 at 08:42
4 Answers
It will vary a lot from compiler to compiler, as different compilers implement it with different levels of aggression. GCC is fairly aggressive about it: enabling strict aliasing will cause it to think that pointers that are "obviously" equivalent to a human (as in, foo *a; bar *b = (bar *) a;) cannot alias, which allows for some very aggressive transformations, but can obviously break non-carefully written code. Apple's GCC disables strict aliasing by default for this reason.
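To make the cast case concrete, here is a minimal sketch of the kind of type punning that strict aliasing can break, next to the usual well-defined alternative. The function name float_bits is hypothetical, and the sketch assumes float is a 32-bit IEEE 754 type; the problematic cast is shown only in a comment.

```cpp
#include <cstdint>
#include <cstring>

// Assumption for this sketch: float is 32 bits wide (IEEE 754 single precision).
static_assert(sizeof(float) == sizeof(std::uint32_t), "sketch assumes 32-bit float");

// The cast version that strict aliasing may break (undefined behavior):
//     std::uint32_t bits = *(std::uint32_t *)&f;
// The compiler may assume a uint32_t access never touches a float object.

// Well-defined alternative: copy the object representation with memcpy.
// Optimizing compilers typically reduce this to a single register move.
std::uint32_t float_bits(float f) {
    std::uint32_t bits;
    std::memcpy(&bits, &f, sizeof bits);
    return bits;
}
```

Code written this way does not depend on -fno-strict-aliasing, so it behaves the same whichever setting the compiler uses.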
LLVM, by contrast, does not even have strict aliasing, and, while it is planned, the developers have said that they plan to implement it as a fall-back case when nothing else can judge equivalence. In the above example, it would still judge a and b equivalent. It would only use type-based aliasing if it could not determine their relationship in any other way.
In my experience, the performance impact of strict aliasing mostly has to do with loop invariant code motion, where type information can be used to prove that in-loop loads can't alias the array being iterated over, allowing them to be pulled out of the loop. YMMV.
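As a rough illustration of that loop-invariant case, consider the sketch below. The names Scale and scale_all are made up for this example, and whether a given compiler actually hoists the load depends on its version and flags.

```cpp
#include <cstddef>

// Hypothetical type for this sketch: an int field read inside a float loop.
struct Scale {
    int factor;
};

// The loop stores through a float pointer and reads an int through `s`.
// With strict aliasing, the compiler may assume the float stores never
// modify s->factor, so the load can be hoisted out of the loop.
// With -fno-strict-aliasing it may have to reload s->factor each iteration,
// because `out` could legally point at the same memory.
void scale_all(float *out, const float *in, const Scale *s, std::size_t n) {
    for (std::size_t i = 0; i < n; ++i) {
        out[i] = in[i] * static_cast<float>(s->factor);
    }
}
```

Copying s->factor into a local before the loop gives the same effect by hand, independent of the aliasing setting.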
-
Note that since this answer was posted, clang does now have `-fstrict-aliasing` as well. – Keith Smiley Apr 29 '22 at 23:48
What I can tell you from experience (having tested this with a large project on PS3, PowerPC being an architecture that, due to its many registers, can actually benefit from SA quite well) is that the optimizations you're going to see are generally going to be very local (scope-wise) and small. On a 20 MB executable it shaved maybe 80 KB off the .text section (= code), and this was all in small scopes and loops.
This option can make your generated code a bit more lightweight and optimized than it is right now (think in the 1 to 5 percent range), but do not expect any big results. Hence, the effect of using -fno-strict-aliasing is probably not going to be a big influence on your performance, at all. That said, having code that requires -fno-strict-aliasing is a suboptimal situation at best.

-
Because code size == speed? Your PS3 example is neither here nor there. How did it RUN? – Eloff Oct 08 '13 at 18:30
-
Where do I say it's faster? It might be - it ain't unthinkable at all given that potential loads/stores are omitted - and in any case a smaller executable is preferable on a memory bound machine. So it's here, and it's also there. – nielsj Oct 11 '13 at 11:24
-
The OP asked for performance implications and you only discussed code size. Then you used an ipso facto argument as to why you won't see a big performance difference. Then you called the Linux kernel a "suboptimal situation at best". I think you can see why you got the downvote. – Eloff Oct 24 '13 at 19:48
-
In many situations, code which ignores the strict aliasing rule can be easier to read and will yield decent performance even on compilers which (perhaps because of configuration) don't optimize memcpy() particularly efficiently. If strict aliasing doesn't yield a meaningful performance boost, isn't writing ugly cumbersome code sub-optimal compared with writing readable code and disabling strict aliasing? – supercat Oct 13 '15 at 06:41
Here is a link to a study conducted in 2004: http://docs.lib.purdue.edu/cgi/viewcontent.cgi?article=1124&context=ecetr which examines, among other things, the impact of strict aliasing on code performance. Figure 2.5 shows a relative improvement of 3% to 10%.
Researchers' explanation of performance degradation:
From inspecting the assembly code, we found that the degradation is an effect of the register allocation algorithm. GCC implements a graph coloring register allocator[2, 3]. With strict aliasing, the live ranges of the variables become longer, leading to high register pressure and spilling. With more conservative aliasing, the same variables incur memory transfers at the end of their (shorter) live ranges as well.
[2] Peter Bergner, Peter Dahl, David Engebretsen, and Matthew T. O’Keefe. Spill code minimization via interference region spilling. In SIGPLAN Conference on Programming Language Design and Implementation, pages 287–295, 1997.
[3] Preston Briggs, Keith D. Cooper, and Linda Torczon. Improvements to graph coloring register allocation. ACM Transactions on Programming Languages and Systems, 16(3):428–455, May 1994.

-
Do you know of any studies that looked at the value of optimizations that would be allowed by strict aliasing *that could not also be achieved via the `restrict` qualifier*? – supercat Sep 26 '17 at 23:26
-
I don't know any, do you know of any optimization that would satisfy these requirements? – SzymonPajzert Sep 28 '17 at 12:06
-
The `restrict` qualifier cannot be used effectively to tell a compiler that static-duration objects won't alias, and in some cases programs may have multiple possibly-related pointers to the same type which could alias each other. Type-based aliasing would allow some optimizations in those cases that would not otherwise be available. IMHO, such cases could have been better handled by adding better qualifiers (e.g. allow a `register` qualifier in cases where an object may be exposed to outside code, but which could behave as though they get a new address when accessed other than via pointer). – supercat Sep 28 '17 at 14:22
This flag can have an impact on loop vectorization, and thus on performance, as the following example shows:
// a simple test code
#include<vector>

void add(double *v, double *b, double *c, int *idx, std::vector<int> &v1) {
    for(int i=v1[0];i<v1[2];i++){
        v[i] = b[i] + c[i];
    }
}
If you compile the code in https://godbolt.org/ using GCC 11.2 with the flags -O3 -ftree-vectorize -ftree-loop-vectorize -fopt-info-vec-missed -fopt-info-vec-optimized -fno-strict-aliasing, you will see the message:
<source>:5:22: missed: couldn't vectorize loop
<source>:5:22: missed: not vectorized: number of iterations cannot be computed.
Now if you remove -fno-strict-aliasing or replace it with -fstrict-aliasing, you will see:
<source>:5:22: optimized: loop vectorized using 16 byte vectors
<source>:5:22: optimized: loop versioned for vectorization because of possible aliasing
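A plausible reading of these messages is that, without type-based aliasing, a store through the double pointer v might modify the int elements of v1, so the bound v1[2] has to be reloaded on every iteration and the trip count is unknown. Under that assumption, copying the bounds into locals should make the trip count computable regardless of the flag. The sketch below uses the hypothetical name add_hoisted and drops the unused idx parameter; I have not checked the vectorizer output for it.

```cpp
#include <vector>

// Variant of the test function with the loop bounds read once, up front.
// Stores through `v` can no longer affect the loop condition, whatever the
// compiler assumes about aliasing between double and int objects.
void add_hoisted(double *v, const double *b, const double *c,
                 const std::vector<int> &v1) {
    const int begin = v1[0];
    const int end = v1[2];
    for (int i = begin; i < end; ++i) {
        v[i] = b[i] + c[i];
    }
}
```

The arrays v, b and c are all double pointers and may still overlap, so the compiler may still need a runtime overlap check before it uses the vectorized loop.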
