It is very hard to find out exactly what happens to your code before and after optimisation. As you already knew and pointed out yourself, you are going out of bounds of an array, which leads to undefined behaviour.
However, you are curious about why it is (apparently) the -O3 flag which "causes" the issue!
Let's start by saying that it is actually the -fpeel-loops flag (tested here as -O -fpeel-loops) which causes your code to be reorganised in a way that makes your error apparent. -O3 enables several optimisation flags, among them -fpeel-loops.
You can read here about which compiler flags are enabled at each optimisation level.
In a nutshell, -fpeel-loops will reorganise the loop so that the first and last couple of iterations are taken out of the loop itself, and some variables are removed from memory altogether. Small loops may even be taken apart completely!
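To make the idea of peeling concrete, here is a minimal hand-written sketch. It is not what GCC actually emits; the function names copy_plain and copy_peeled are made up for illustration, and std::vector is used only to keep the example in bounds. It shows a loop next to a version whose first iteration has been moved in front of the loop:

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Reference version: a plain loop copying src into dst.
std::vector<int> copy_plain(const std::vector<int>& src) {
    std::vector<int> dst(src.size());
    for (std::size_t i = 0; i < src.size(); ++i) {
        dst[i] = src[i];
    }
    return dst;
}

// Hand-peeled version (illustrative only): the first iteration is hoisted
// out of the loop, which is roughly the shape a peeling pass may produce.
std::vector<int> copy_peeled(const std::vector<int>& src) {
    std::vector<int> dst(src.size());
    if (!src.empty()) {
        dst[0] = src[0];                  // peeled first iteration
    }
    for (std::size_t i = 1; i < src.size(); ++i) {
        dst[i] = src[i];                  // remaining iterations
    }
    return dst;
}
```

As long as every access stays in bounds, both versions produce the same result. The compiler is only entitled to this kind of rewrite under that assumption.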
With this said, try running this piece of code with -O -fpeel-loops, or even with -O3:
#include <iostream>
#include <cstdlib>
using namespace std;

int n, d;
int *f, *c;

void loadData() {
    int fdata[] = {7, 2, 2, 7, 7, 1};
    int cdata[] = {66, 5, 4, 3, 2};  // only 5 elements, but n is 6
    n = 6;
    d = 3;
    f = new int[n + 1];
    c = new int[n];
    f[0] = fdata[0];
    c[0] = cdata[0];
    for (int i = 1; i < n; i++) {
        cout << f[i];      // prints f[i] before it is assigned (indeterminate value)
        f[i] = fdata[i];
        c[i] = cdata[i];   // i == 5 reads cdata out of bounds: undefined behaviour
    }
    cout << "\nFINAL F[5]:" << f[5] << endl;
}

int main() {
    loadData();
}
You will see that it prints 1 for f[5] regardless of your optimisation flags. That is because of the statement cout << f[i], which changes the way -fpeel-loops operates.
Also, experiment with this block of code:
f[0] = fdata[0];
c[0] = cdata[0];
c[1] = cdata[1];
c[2] = cdata[2];
c[3] = cdata[3];
c[4] = cdata[4];
for (int i = 1; i < n; i++) {
    f[i] = fdata[i];
    c[5] = cdata[5];
}
cout << "\nFINAL F[5]:" << f[5] <<endl;
You will see that even in this case, with all your optimisation flags, the output is 1 and not 0. The same holds when the loop is split in two:
for (int i = 1; i < n; i++) {
    c[i] = cdata[i];
}
for (int i = 1; i < n; i++) {
    f[i] = fdata[i];
}
The output is again 1. This is, once more, because we have changed the structure of the loop, and -fpeel-loops is no longer able to reorganise it in the way that produced the error. The same is true if you rewrite it as a while loop:
int i = 1;
while (i < 6) {
    f[i] = fdata[i];
    c[i] = cdata[i];
    i++;
}
Note, though, that with the while loop -O3 may trigger the -Waggressive-loop-optimizations diagnostic, which stops the build if warnings are treated as errors, so you should test it with -O -fpeel-loops.
So we can't really know for sure how your compiler is reorganising that loop for you. However, it is using the so-called as-if rule to do so, and you can read more about it here.
Of course, your compiler takes the freedom of refactoring that code for you on the assumption that you abide by a set of rules. Under the as-if rule the compiler will always produce the desired output, provided that the programmer has not caused undefined behaviour. In our case we have indeed broken the rules by going out of bounds of that array. The strategy used to refactor the loop was built on the idea that this could not happen, but we made it happen.
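For contrast, here is a minimal sketch of the same copy with no undefined behaviour, so the as-if rule guarantees the same observable result at every optimisation level. The names Loaded and load_data_safe, and the switch to std::vector, are assumptions made for illustration and are not part of the original code. The key change is that each loop bound comes from the array actually being read:

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical container for the two arrays (replaces the globals f and c).
struct Loaded {
    std::vector<int> f;
    std::vector<int> c;
};

// Copies each source array using its own size as the loop bound,
// so cdata is never read past its fifth element.
Loaded load_data_safe() {
    const std::vector<int> fdata{7, 2, 2, 7, 7, 1};
    const std::vector<int> cdata{66, 5, 4, 3, 2};   // still only 5 elements

    Loaded out;
    out.f.resize(fdata.size());
    out.c.resize(cdata.size());

    for (std::size_t i = 0; i < fdata.size(); ++i) {
        out.f[i] = fdata[i];
    }
    for (std::size_t i = 0; i < cdata.size(); ++i) {
        out.c[i] = cdata[i];
    }
    return out;
}
```

With this version, load_data_safe().f[5] is 1 whatever flags are used, because the optimiser's assumption that no access goes out of bounds now actually holds.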
So yes, it is not as simple and straightforward as saying that all your problems are created by reading from an unallocated memory address. Of course, that is ultimately why the problem occurred, but you were reading cdata out of bounds, which is a different array from f! The mere out-of-bounds read does not by itself explain the issue, because if you unroll the loop by hand like this:
int i = 0;
f[i] = fdata[i];
c[i] = cdata[i];
i++;
f[i] = fdata[i];
c[i] = cdata[i];
i++;
f[i] = fdata[i];
c[i] = cdata[i];
i++;
f[i] = fdata[i];
c[i] = cdata[i];
i++;
f[i] = fdata[i];
c[i] = cdata[i];
i++;
f[i] = fdata[i];
c[i] = cdata[i];
i++;
It will work and print 1, even though we are definitely going out of bounds with the cdata array! So it is not the mere fact of reading from an unallocated memory address that is causing the issue.
Once again, it is actually the loop-refactoring strategy of -fpeel-loops, which believed that you would not go out of bounds of that array and changed your loop accordingly.
Lastly, the fact that the optimisation flags lead you to an output of 0 is specific to your machine and compiler. That 0 has no meaning: it is not the product of an actual logical operation, because it is a junk value found at a non-allocated memory address, or the result of an operation performed on such a junk value.
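If you would rather have such a mistake reported than silently yield a junk value, one option is a bounds-checked read. The sketch below is only an illustration: read_checked is a made-up helper name and the switch to std::vector is an assumption. It relies on std::vector::at, which throws std::out_of_range for a bad index instead of invoking undefined behaviour:

```cpp
#include <cassert>
#include <cstddef>
#include <stdexcept>
#include <vector>

// Hypothetical helper: stores v[i] in out and returns true when i is valid;
// returns false and leaves out untouched when i is out of range.
bool read_checked(const std::vector<int>& v, std::size_t i, int& out) {
    try {
        out = v.at(i);                    // at() throws on an invalid index
        return true;
    } catch (const std::out_of_range&) {
        return false;
    }
}
```

With a five-element cdata, index 4 succeeds and index 5 is rejected, so the out-of-bounds read from the original loop would be caught at run time.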