I have a short program that performs a numerical computation and obtains an incorrect NaN result when some specific conditions hold. I cannot see how this NaN result can arise. Note that I am not using compiler options that allow the reordering of arithmetic operations, such as -ffast-math.
Question: I am looking for an explanation of how the NaN result arises. Mathematically, there is nothing in the computation that leads to division by zero or similar. Am I missing something obvious?
Note that I am not asking how to fix the problem—that is easy. I am simply looking for an understanding of how the NaN appears.
Minimal example
Note that this example is very fragile and even minor modifications, such as adding printf() calls in the loop to observe values, will change the behaviour. This is why I was unable to minimize it further.
// prog.c
#include <stdio.h>
#include <math.h>

typedef long long myint;

void fun(const myint n, double *result) {
    double z = -1.0;
    double phi = 0.0;
    for (myint i = 0; i < n; i++) {
        double r = sqrt(1 - z*z);
        /* avoids division by zero when r == 0 */
        if (i != 0 && i != n-1) {
            phi += 1.0 / r;
        }
        double x = r*cos(phi);
        double y = r*sin(phi);
        result[i + n*0] = x;
        result[i + n*1] = y;
        result[i + n*2] = z;
        z += 2.0 / (n - 1);
    }
}

#define N 11

int main(void) {
    // perform computation
    double res[3*N];
    fun(N, res);
    // output result
    for (int i=0; i < N; i++) {
        printf("%g %g %g\n", res[i+N*0], res[i+N*1], res[i+N*2]);
    }
    return 0;
}
Compile with:
gcc -O3 -mfpmath=387 prog.c -o prog -lm
The last line of the output is:
nan nan 1
Instead of NaN, I expect a number close to zero.
Critical features of the example
The following must all hold for the NaN output to appear:
Compile with GCC on an x86 platform. I was able to reproduce this with GCC 12.2.0 (from MacPorts) on macOS 10.14.6, as well as with GCC versions 9.3.0, 8.3.0 and 7.5.0 on Linux (openSUSE Leap 15.3).
I cannot reproduce it with GCC 10.2.0 or later on Linux, or GCC 11.3.0 on macOS.
Choose to use x87 instructions with -mfpmath=387, and an optimization level of -O2 or -O3.
myint must be a signed 64-bit type.
Thinking of result as an n-by-3 matrix, it must be stored in column-major order.
No printf() calls in the main loop of fun().
Without these features, I do get the expected output, i.e. something like 1.77993e-08 -1.12816e-08 1 or 0 0 1 as the last line.
Explanation of the program
Even though it doesn't really matter to the question, I give a short explanation of what the program does, to make it easier to follow. It computes the x, y, z three-dimensional coordinates of n points on the surface of a sphere in a specific arrangement. The z values go from -1 to 1 in equal increments; however, the last value won't be precisely 1 due to numerical round-off errors. The coordinates are written into an n-by-3 matrix, result, stored in column-major order. r and phi are polar coordinates in the (x, y) plane.
Note that when z is -1 or 1, r becomes 0. This happens in the first and last iteration steps. This would lead to division by 0 in the 1.0 / r expression. However, 1.0 / r is excluded from the first and last iteration of the loop.