In Matlab, I compute a rotation of a collection of 2D points in two ways: one by a regular matrix-matrix product, and the other by iterative vector-matrix products, as follows.
>> points = % read data from some file, Nx2 matrix
>> R = [cosd(-18), sind(-18); -sind(-18), cosd(-18)]; % My rotation matrix
>> prod1 = points * R;
>> numpt = size(points, 1);
>> for k=1:numpt, prod2(k,:) = points(k,:) * R; end;
I am using a "regular" (Intel-based) PC running Windows 10.
It turns out that on some computers prod1 ~= prod2, and on other computers prod1 == prod2. This can be checked by
>> max(max(abs(prod2 - prod1)))
ans =
1.1102e-16
This difference is equal to 0 on the "weaker" computers and nonzero on my "powerful" computer.
I suppose that the reason this happens on some computers but not others is that, where it happens, there is some H/W acceleration of matrix multiplication (maybe involving madd, i.e. fused multiply-add, ternary operations, which are notorious for this kind of difference).
Is this a known issue, like a "bug"? Is there a workaround, for example a way to disable or suspend this sort of H/W acceleration?
I am seeking to obtain identical outcomes of the computation on different computers, as part of a unit test. I can settle for "near equality", but I would rather not if true equality is attainable.
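For reference, here is a minimal sketch of what the "near equality" fallback could look like. The random `points` matrix is a hypothetical stand-in for my file data, and the tolerance of 4 ulps of the largest result is an arbitrary assumption, not a derived bound.

```matlab
% Hypothetical stand-in for the data read from file (assumption: any Nx2 double matrix).
rng(0);
points = 100 * rand(1000, 2);
R = [cosd(-18), sind(-18); -sind(-18), cosd(-18)];

% Same two computations as in the question.
prod1 = points * R;
prod2 = zeros(size(points));
for k = 1:size(points, 1)
    prod2(k, :) = points(k, :) * R;
end

% Strict test: true only if every element is bit-for-bit identical.
bitIdentical = isequal(prod1, prod2);

% Relaxed test: allow a few units in the last place of the largest result.
tol = 4 * eps(max(abs(prod1(:))));   % the factor 4 is an assumption
nearlyEqual = all(abs(prod1(:) - prod2(:)) <= tol);

fprintf('bit-identical: %d, nearly equal: %d\n', bitIdentical, nearlyEqual);
```

The strict test is the one I actually want to pass on every machine; the relaxed one is only the fallback.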
EDIT 1
I stress that the core issue is that the exact same syntactical expression produces different results on different computers, and the apparent cause is different computational optimizations done on different computers. Bit identity is a requirement that cannot be waved off. I would like both platforms, which are 64-bit Intel-based Windows 10 machines, to compute exactly the same outcome for exactly the same input and expression.
EDIT 2
I have tried to extract the specific elements of the input on which the outcome differs (i.e. the rows k where prod2(k,:) ~= prod1(k,:)). I collected them into a matrix to repeat the computation just on them. And wow! This time the outcome is not the same as in prod1. It's more similar to the explicitly iterative case.
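The extraction step looks roughly like the sketch below. Again, the random `points` matrix is a hypothetical stand-in for my data, and names such as `rowDiffers` and `subsetProd` are only illustrative.

```matlab
% Hypothetical stand-in for the data read from file (assumption: any Nx2 double matrix).
rng(0);
points = 100 * rand(1000, 2);
R = [cosd(-18), sind(-18); -sind(-18), cosd(-18)];

prod1 = points * R;
prod2 = zeros(size(points));
for k = 1:size(points, 1)
    prod2(k, :) = points(k, :) * R;
end

% Rows where the two results differ in at least one coordinate.
rowDiffers = any(prod1 ~= prod2, 2);

% Repeat the matrix-matrix product on just those rows.
subset = points(rowDiffers, :);
subsetProd = subset * R;

% Compare the subset result against both earlier results.
matchesFull = isequal(subsetProd, prod1(rowDiffers, :));
matchesLoop = isequal(subsetProd, prod2(rowDiffers, :));
fprintf('%d differing rows; matches full product: %d, matches loop: %d\n', ...
    nnz(rowDiffers), matchesFull, matchesLoop);
```

On a machine where the two products already agree, `rowDiffers` is all false and the subset is empty, so both comparisons are trivially true.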
So I'm sorry, I can't quite reproduce the problem in the scope of this discussion. But we can discuss some more.
I'll check on Zhang's thread issue. I will also try repeating the computation several times to see whether the differences appear at the same locations. If it's about multithreading, I expect the differences to show up at non-deterministic indexes.
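A rough sketch of the experiment I have in mind follows. It assumes that `maxNumCompThreads` is an adequate way to restrict MATLAB to one computational thread for this test; the random `points` matrix and the run count of 20 are hypothetical placeholders.

```matlab
% Hypothetical stand-in for the data read from file (assumption: any Nx2 double matrix).
rng(0);
points = 100 * rand(1000, 2);
R = [cosd(-18), sind(-18); -sind(-18), cosd(-18)];

% Row-by-row reference, as in the question.
prod2 = zeros(size(points));
for k = 1:size(points, 1)
    prod2(k, :) = points(k, :) * R;
end

% 1) Repeat the full product and check that it is the same on every run.
nRuns = 20;                          % arbitrary number of repetitions
ref = points * R;
sameEveryRun = true;
for r = 1:nRuns
    sameEveryRun = sameEveryRun && isequal(points * R, ref);
end

% 2) Repeat it with a single computational thread, then restore the setting.
prevThreads = maxNumCompThreads(1);  % returns the previous maximum
prodSingle = points * R;
maxNumCompThreads(prevThreads);

% Rows that still differ from the row-by-row result in single-threaded mode.
diffRowsSingle = find(any(prodSingle ~= prod2, 2));
fprintf('repeatable: %d, differing rows (1 thread): %d\n', ...
    sameEveryRun, numel(diffRowsSingle));
```

If the differing rows are identical on every run and persist with one thread, multithreading alone would not explain the discrepancy.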