I ran into problems while porting some complex code to macOS/arm64 and reduced them to the following trivial program, which behaves differently than on macOS/x86_64 (using the native osx/arm64 clang version 14.0.6 from conda-forge, and cross-compiling for x86_64):
#include <assert.h>
#include <stdio.h>
int main()
{
double y[2] = {-0.01,0.9};
double r;
r = y[0]+0.03*y[1];
printf("r = %24.26e\n",r);
assert(r == 0.017);
}
The result on arm64 is
$ clang -arch arm64 test.c -o test; ./test
Assertion failed: (r == 0.017), function main, file test.c, line 9.
r = 1.69999999999999977517983751e-02
zsh: abort ./test
while the result on x86_64 is
$ clang -arch x86_64 test.c -o test; ./test
r = 1.70000000000000012212453271e-02
$
The test program has also been compiled and run natively on an x86_64 machine; it yields the same result as the x86_64 binary above (which was cross-compiled on arm64 and run with Rosetta).
What matters here is not that the arm64 result is not bitwise equal to 0.017 parsed and stored as an IEEE 754 number, but that the expression evaluates to a different value than on x86_64.
Update 1:
To check for possible differences in conventions (e.g. rounding mode), the following program has been compiled and run on both platforms:
#include <iostream>
#include <limits>
#define LOG(x) std::cout << #x " = " << x << '\n'
int main()
{
using l = std::numeric_limits<double>;
LOG(l::digits);
LOG(l::round_style);
LOG(l::epsilon());
LOG(l::min());
return 0;
}
It yields the same result on both platforms:
l::digits = 53
l::round_style = 1
l::epsilon() = 2.22045e-16
l::min() = 2.22507e-308
hence the problem seems to be elsewhere.
Update 2:
If it can help: under arm64, the result obtained with the expression is the same as the one obtained by calling refBLAS ddot with the vectors {1,0.03} and y.
Update 3:
The toolchain seems to be the cause. Using the default toolchain of macOS 11.6.1:
mottelet@portmottelet-cr-1 ~ % clang -v
Apple clang version 13.0.0 (clang-1300.0.29.30)
Target: arm64-apple-darwin20.6.0
Thread model: posix
InstalledDir: /Applications/Xcode.app/Contents/Developer/Toolchains/XcodeDefault.xctoolchain/usr/bin
gives the same result for both architectures! So the problem seems to be in the toolchain I am actually using: version 1.5.2 of the conda package cxx-compiler (I need conda as a package manager because the application I am building has a lot of dependencies that conda provides).
Using -v shows a bunch of compilation flags; which one could be responsible?