OpenCV double matrix division by scalar produces incorrect result

Question

I'm curious if anyone has been able to get correct results when dividing matrices (with double values) with a scalar (double). I noticed OpenCV does not give correct (well, "exact") results when I was trying to track down the source of some inconsistencies between an algorithm in MATLAB and one being reproduced in C++. Here is a minimal example of the problem I'm seeing:

cv::Mat some_matrix(1, 1, CV_64FC1, cv::Scalar::all(95));
cv::Mat some_matrix_div = some_matrix / 235.0;
printf(
        "Expected: %.53g\n"
        "OpenCV  : %.53g\n",
        some_matrix.at<double>(0,0) / 235.0,
        some_matrix_div.at<double>(0,0) );

After running I see

Expected: 0.40425531914893614304773450385255273431539535522460938
OpenCV  : 0.404255319148936198558885735110379755496978759765625

The first is what the value should be (and what you'll get if you perform the double precision division of 95/235 in C++ or MATLAB) but the second is what OpenCV produces upon using the division operator. I tried tracking down the problem in the OpenCV source code but the matrix operations are a bit convoluted and I don't have a whole lot of time at the moment to rifle through it, so I'm wondering if anyone else has encountered this problem and knows a fix?

EDIT

I'll add some clarification.

First, I know doubles are not exact numeric representations. What I meant by "exact" (why it was in quotes) was the double division performed outright (e.g. printing the result of 95.0/235.0) is not exactly the same as what OpenCV does when you divide a matrix by a scalar, despite that the value in the matrix is indeed stored as a double, and the scalar is indeed treated as a double as well. One would expect the two results should be identical; namely, if I divide a double by another double, the result should be the same as an OpenCV double matrix divided by a double scalar.

I have also already tried explicitly casting all numeric constants as doubles in the code, with no success.

While it's true in this case the difference is relatively small (e^-16), I'm not sure how this might possibly compound to make larger and larger errors over time. That's one issue. The other is more of a minor annoyance, a misunderstanding why OpenCV isn't doing what one would intuitively expect it to. In the end it may not cause any issues, but if the odd behavior can be avoided I'd obviously prefer that, especially since it makes it unclear when a calculation does not match the expected outcome from the MATLAB results because of a calculation oddity or because of an actual algorithm implementation issue (which is what I assumed it was).

Hopefully that's more clear.

Your results are not "exact" because `double` is not exact. http://en.cppreference.com/w/cpp/types/numeric_limits/is_exact — Drew Dormann, Jan 05 '15 at 21:07
Floating point calculations are going to be inexact. That's the nature of binary floating point. http://stackoverflow.com/questions/2100490/floating-point-inaccuracy-examples — PaulMcKenzie, Jan 05 '15 at 21:08
Are you compiling with -fast-math or something along those lines? — Borgleader, Jan 05 '15 at 21:09
The two results coincide for 16 digits, which is double precision. Is the difference really relevant? – DrewDormann, PaulMcKenzie, well yes, they are not exact. The question is, why aren't they inexact the exactly same way? — A. Donda, Jan 05 '15 at 21:34
Have you tried explicitly casting the scalar literal `235.0`? It's possible that one is being interpreted as a float rather than double. — beaker, Jan 05 '15 at 21:34
@Borgleader Interestingly enough, if I compile with --fast-math the OpenCV result does indeed now match! Although I don't quite understand why the OpenCV result changed instead of the direct computation I preformed. It must affect an inline operation in OpenCV. Could you write up an answer for me to accept? This did seem to fix the problem. — Anthony, Jan 05 '15 at 22:20
Ah, I got mixed up. It was indeed NOT the OpenCV result that changed. That makes more sense. So my OpenCV library must be compiled with --fast-math. — Anthony, Jan 05 '15 at 22:49

score 0 · Answer 1 · answered Jan 05 '15 at 21:48

Floating point math is inherently not exact. On the x86 platform, double precision numbers can be computed using either the FPU (80 bits extended precision), or with the SSE/AVX vector unit (64 bits double precision). Where this computation is done depends on choice of compiler and various options passed to the compiler. Worse, if the compiler runs out of 80-bit registers, it "spills" the results to memory as a 64-bit result. It turns out that the vector unit is faster for most floating point operations even for scalars, so that is often preferred by the compiler, when allowed.

If the software is explicitly written to use SSE or AVX for maximum speed, then it would definitely be using the 64-bit version. This might be the case for OpenCV. OpenCV might even approximate the computation by computing a reciprocal first (1.0/235.0), then multiplying the result through every pixel, since this would be much faster.

Some things to try:

some_matrix.at<double>(0,0) * (1.0 / 235.0)

Also try changing your compiler flags to include -mfpmath=sse -msse2 to ensure your compiler knows that you have an SSE unit, and to use it for double precision.

Read here for a longer description of these effects: https://gcc.gnu.org/wiki/x87note

I updated my comment to clarify a bit. Floating point inexactness isn't the problem. Namely, it's not getting the same result for what should be two exactly same computations (double / double). The hardware vector units are an interesting idea though, I hadn't thought of that. Including the flags didn't help though. It's possible the OpenCV libraries were told to preform their computations differently, but I'm not sure how to determine how they're done. Maybe I just need to rebuild it myself so I know for sure... — Anthony, Jan 05 '15 at 22:11
Sure, the OpenCV libraries are built with their own set of compiler flags, and you'd want things to match. Try `-ffast-math`. — Peter, Jan 05 '15 at 22:48

score 0 · Accepted Answer · answered Mar 12 '15 at 13:03

As mentioned by @Borgleader in a comment, the problem was library compilation with the -fast-math compiler option in a computation library but not my application. In this case it wasn't OpenCV but rather another library in my distribution, but it was this discrepancy that caused the difference. The problem was fixed by rebuilding that library without the flag for consistent results.

OpenCV double matrix division by scalar produces incorrect result

2 Answers2