
In Matlab, I compute a rotation of a collection of 2D points in two ways: one by a regular matrix-matrix product, and the other by iterative vector-matrix products, as follows.

>> points = % read data from some file, Nx2 matrix
>> R = [cosd(-18), sind(-18); -sind(-18), cosd(-18)];  % My rotation matrix
>> prod1 = points * R;
>> numpt = size(points, 1);
>> prod2 = zeros(numpt, 2);   % preallocate
>> for k = 1:numpt, prod2(k,:) = points(k,:) * R; end

I am using a "regular" (Intel-based) PC running Windows 10.

It turns out that on some computers prod1 ~= prod2, while on other computers prod1 == prod2. This can be checked by

>> max(max(abs(prod2 - prod1)))

ans =

   1.1102e-16

This difference is equal to 0 on the "weaker" computers and nonzero on my "powerful" computer.

I suppose that the reason for this happening on some computers but not others is that, where it happens, there is some hardware acceleration of matrix multiplication (perhaps fused multiply-add (FMA) instructions, which are notorious for this kind of difference).

Is this a known issue, perhaps a "bug"? Is there a workaround, for example to disable or suspend this sort of hardware acceleration?

I am seeking to obtain identical outcomes of the computation on different computers, as part of a unit test. I can settle for "near equality", but I would rather not if true equality is attainable.

EDIT 1

I stress that the core issue is that the exact same syntactical expression produces different results on different computers, and the apparent cause is different computational optimizations on different platforms. Bit identity is a requirement that cannot be waved away. I would like both platforms, both of which are 64-bit Intel-based Windows 10 machines, to compute exactly the same outcome for exactly the same input and expression.

EDIT 2

I have tried to extract the specific elements of the input on which the outcomes differ (i.e. prod2(k,:) ~= prod1(k,:)). I collected them into a matrix to repeat the computation on just those rows. And wow! This time the outcome is not the same as in prod1; it is closer to the explicitly iterative case.

So I'm sorry, I can't quite reproduce the problem in the scope of this discussion. But we can discuss some more.

I'll check on Zhang's multithreading suggestion. I will also repeat the computation several times to see whether the differences appear at the same locations. If multithreading is the cause, I expect the differences to show up at non-deterministic indices.
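For reference, this is the kind of check I have in mind (a minimal sketch; maxNumCompThreads limits MATLAB's computational thread count, and the variable names follow the code above):

>> nOld = maxNumCompThreads(1);    % force single-threaded computation
>> prod1s = points * R;            % recompute the full matrix product
>> maxNumCompThreads(nOld);        % restore the previous thread count
>> max(max(abs(prod1s - prod1)))   % compare against the multithreaded result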

Ofri Sadowsky
    "true equality" does not exist below machine precision, 1e-16 is a smaller error than can be accurately represented using a numeric double, see [`eps('double')`](https://uk.mathworks.com/help/matlab/ref/eps.html) and https://stackoverflow.com/q/686439/3978545 – Wolfie May 15 '23 at 15:42
    Don't expect identical results, expect the same results but with different rounding errors. – Cris Luengo May 17 '23 at 03:31
  • I'm afraid that both comments missed the issue. The core problem is that the exact same syntactical expression with the same input is evaluated to different outcomes on different platforms. The apparent cause is that one platform optimizes the evaluation differently from the other. Bit identity is sometimes, not always, a requirement. It should not be waved away. What I seek is an option to disable the optimization while testing the code, so that different platforms will produce **equal** results. – Ofri Sadowsky May 17 '23 at 05:32
  • It is not about optimization, it is about computation order and internal details. I can write code in C and get tiny differences in the result depending on which compiler I use, which flags I pass to the compiler, which platform is targeted, etc. With floating point computations you should not expect identical results, ever. Take that idea out of your mind, it will not happen. *Equal* results you should get, *identical* you will not. – Cris Luengo May 17 '23 at 13:33
  • I'm not sure what you mean precisely. My current situation is that I have a unit test which is not portable across different hosts that have (almost) the same OS and processor type. I can make it work, but I don't think it's supposed to be this way. The same operation with the same input produces one result on one host, and another on the other. – Ofri Sadowsky May 17 '23 at 13:59
  • do you really need to dive down to e-16? just round or floor to the decimals needed, problem solved, next problem? – John BG May 19 '23 at 01:26
  • I don't really need the e-16 precision. I really need repeatable results. I will pay the price of rounding, which seems to be the cheapest at the moment. I'll write more details in a separate answer. – Ofri Sadowsky May 22 '23 at 07:05
  • “I have a unit test which is not portable” Ah, that is the problem! In unit tests you should not compare two floating-point arrays for equality, every comparison needs a tolerance. – Cris Luengo May 22 '23 at 14:09

3 Answers


The official response from MathWorks tech support points to these links:

  1. An answer to a similar question explaining that the details of the computation order live inside the LAPACK library and are therefore not tunable from Matlab.
  2. An example from the Symbolic Toolbox that shows how to improve overall precision, the price being the Symbolic Toolbox itself, new code that uses it, and increased computation time.

Given this, my preference is to round the computation outcome. As explained in the conversations, my goal is repeatability, not precision, and rounding looks like the most cost-effective answer here.
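A minimal sketch of what I mean (the number of decimals is an assumption to be tuned per application; round(X, N) requires R2014b or later):

decimals = 12;                     % assumed precision; tune for your data
prod1r = round(prod1, decimals);   % discard sub-tolerance digits
prod2r = round(prod2, decimals);
isequal(prod1r, prod2r)            % expected true once sub-tolerance differences are gone

Note that rounding is not a hard guarantee of bit identity: values that land very close to a rounding boundary can still round differently on the two hosts, so this only shrinks the problem to a rare corner case.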

Ofri Sadowsky

Each variable in each step needs to be verified to answer all the questions. In general, the result should be the same for well-defined math functions across platforms. Meanwhile, you can leverage the Symbolic Math Toolbox to increase the precision of numeric calculations.
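For instance, a minimal sketch of the symbolic route (this assumes the Symbolic Math Toolbox is installed, and it is far slower than plain double arithmetic):

Rs = [cosd(sym(-18)), sind(sym(-18)); -sind(sym(-18)), cosd(sym(-18))];
prodSym = sym(points) * Rs;    % exact symbolic product, no rounding
prodDbl = double(prodSym);     % a single deterministic rounding back to double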

X Zhang
  • You may read the recent edit of the question. The same syntactical expression with the same input produces different results on different platforms. – Ofri Sadowsky May 17 '23 at 05:43
  • @OfriSadowsky Can you pinpoint when the difference first appeared along the processing pipeline? e.g. after Line 1, or Line 2, etc. Also matlab can be configured to various precision, can you reset both computers to factory defaults? – X Zhang May 17 '23 at 06:28
  • The difference appears between computing an expression of the form `M * R`, where `M` is a matrix of size Nx2 and `R` is a matrix of size 2x2, and computing `P_i = M_i * R`, where i is the row index and the computation is via explicit iteration over the rows instead of whole matrix multiplication. The outcome of the row-wise iteration on host H1 differs from the outcome of the whole matrix multiplication on H1, but it is equal to the outcome of whole matrix multiplication on host H2. Equality meaning bit identity here. I will resort to epsilon comparison if there is no other choice. – Ofri Sadowsky May 17 '23 at 08:14
  • @OfriSadowsky Oh. Similar to [this one](https://www.mathworks.com/matlabcentral/answers/723553-matrix-multiplication-accuracy-problem). Epsilon might be the way to go. Another unlikely possibility is the [mtimes](https://www.mathworks.com/help/matlab/ref/mtimes.html) function being deeply overloaded by unusual toolboxes. `which mtimes` might help. – X Zhang May 17 '23 at 08:27
  • Just to clarify wrt Zhang's question: The first block in my question creates two objects: `prod1` and `prod2`, both matrices of equal size and almost equal content. The difference shows up in my second code block. I will prepare some specific data for which the differences show up. – Ofri Sadowsky May 17 '23 at 08:29
  • `>> which mtimes` gives `built-in (C:\Program Files\MATLAB\R2022a\toolbox\matlab\ops\@char\mtimes)  % char method` on both computers. – Ofri Sadowsky May 17 '23 at 08:32
  • @OfriSadowsky I'm chasing wild goose here, purely imaginative. Assuming the powerful computer has more threads available, which somehow affected the optimization. If you limit the threads with [maxNumCompThreads](https://www.mathworks.com/help/matlab/ref/maxnumcompthreads.html), will it change the result? – X Zhang May 17 '23 at 08:44
  • Let us [continue this discussion in chat](https://chat.stackoverflow.com/rooms/253699/discussion-between-ofri-sadowsky-and-x-zhang). – Ofri Sadowsky May 17 '23 at 11:22

Floating-point operations incur rounding error. The order of operations, which changes depending on platform, compiler, compiler options, etc., affects the rounding error. So internal MATLAB operations will produce slightly different results from one version of MATLAB to another, from one platform to another, and between different ways of expressing the same computation. You simply cannot expect two floating-point arrays to be identical.

So, instead of comparing two arrays using the == operation, which requires them to be identical, you want to compare them with a tolerance. One way to do so is:

tol = 10;   % scale factor for the tolerance, in units of eps
all( abs(prod1 - prod2) < tol * eps(max(abs(prod1),[],'all')) ,'all')
%    ^^^^^^^^^^^^^^^^^^   ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
%    absolute difference              tolerance

Here we are ensuring that the absolute difference between the two arrays is less than some tolerance, for all array elements.

The tolerance is constructed by taking the eps for the maximum absolute value of the array, and scaling that with a parameter tol that you set yourself. eps(x) is the smallest value you can add to x such that the result is no longer equal to x; that is, it is the precision of the value x. A single rounding error is always within eps, but rounding errors accumulate across operations, so you might want to start with tol = 10 as in the example above.
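To make eps concrete, a quick illustration (the printed values assume IEEE double precision, MATLAB's default numeric type):

eps(1)              % ans = 2.2204e-16: spacing of doubles around 1
eps(1000)           % ans = 1.1369e-13: the spacing grows with magnitude
1 + eps(1)/2 == 1   % ans = 1 (true): a half-eps perturbation of 1 is lost

Note that the 1.1102e-16 in the question is exactly eps(1)/2, i.e. a single rounding step at magnitude 1.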

Since you are talking about unit tests, you might be using MATLAB's Testing Framework, in which case you'd use verifyEqual.
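A minimal sketch of such a function-based test (the file name, stand-in data, and tolerance are assumptions for illustration):

function tests = rotationTest
    % rotationTest.m -- run with: results = runtests('rotationTest')
    tests = functiontests(localfunctions);
end

function testRowwiseMatchesMatrixProduct(testCase)
    points = rand(100, 2);    % stand-in for the real input data
    R = [cosd(-18), sind(-18); -sind(-18), cosd(-18)];
    prod1 = points * R;
    prod2 = zeros(size(points));
    for k = 1:size(points, 1)
        prod2(k,:) = points(k,:) * R;
    end
    % Compare with an absolute tolerance instead of bitwise equality.
    verifyEqual(testCase, prod2, prod1, 'AbsTol', 1e-12);
end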

Cris Luengo