I wrote a small program to compute the Euclidean norm of a 3-coordinate vector. Here it is:
#include <array>
#include <cmath>
#include <iostream>
template<typename T, std::size_t N>
auto norm(const std::array<T, N>& arr)
-> T
{
T res{};
for (auto value: arr)
{
res += value * value;
}
return std::sqrt(res);
}
int main()
{
std::array<double, 3u> arr = { 4.0, -2.0, 6.0 };
std::cout << norm(arr) - norm(arr) << '\n';
}
On my computer, it prints -1.12323e-016
.
I know that floating point types should be handled with care. However, I thought that floating-point operations were at least somehow deterministic. This article about floating-point determinism states that:
Some of the things that are guaranteed are the results of addition, subtraction, multiplication, division, and square root. The results of these operations are guaranteed to be the exact result correctly rounded (more on that later) so if you supply the same input value(s), with the same global settings, and the same destination precision you are guaranteed the same result.
As you can see, the only operations that this program does on floating-point values are addition, subtraction, multiplication and square root. If I trust the article I quoted above, considering that it runs in a single thread and that I do not change the rounding modes or anything else floating-point related, I thought that norm(arr) - norm(arr)
would be 0
since I do exactly the same operations on the same values twice.
Are my assumptions wrong, or is this a case of compiler not strictly conformant with regard to IEEE floating-point math? I am currently using MinGW-W64 GCC 4.9.1 32 bits (I tried every optimization level from -O0
to -O3
). The same program with MinGW-W64 GCC 4.8.x displayed 0
, which is what I would have expected.
EDIT: I disassembled the code. I won't post the whole generated assembly because it is too big. However, I believe that the relevant part is here:
call ___main
fldl LC0
fstpl -32(%ebp)
fldl LC1
fstpl -24(%ebp)
fldl LC2
fstpl -16(%ebp)
leal -32(%ebp), %eax
movl %eax, (%esp)
call __Z4normIdLj3EET_RKSt5arrayIS0_XT0_EE
fstpl -48(%ebp)
leal -32(%ebp), %eax
movl %eax, (%esp)
call __Z4normIdLj3EET_RKSt5arrayIS0_XT0_EE
fsubrl -48(%ebp)
fstpl (%esp)
movl $__ZSt4cout, %ecx
call __ZNSolsEd
subl $8, %esp
movl $10, 4(%esp)
movl %eax, (%esp)
call __ZStlsISt11char_traitsIcEERSt13basic_ostreamIcT_ES5_c
movl $0, %eax
movl -4(%ebp), %ecx
.cfi_def_cfa 1, 0
leave
As you can see, __Z4normIdLj3EET_RKSt5arrayIS0_XT0_EE
is called twice, therefore, it is not inlined. I don't understand the whole thing though, and can't tell what is the problem.