2

I'm using the awesome Eigen3 library to write a MATLAB MEX file. But I am experiencing some accuracy issues (compared to MATLAB), even when using long double. The most critical computation seems to be the one where I compute a probability according to the normal distribution. Here is the code snippet:

p.fill( 1/( M_PIl * sigma * sigma ) );
p.array() *= ( - 0.5/pow( sigma, 2.0 ) * ( mu.array() - x.array() ).array().square() ).array().exp();

where x, p and mu are Eigen::Matrix< long double, Dynamic, 1 >. Usually these vectors have a length of 3000.

What are possible steps I can take to get the maximum possible precision? What are the correct GCC compiler flags I can use to force 80 bit precision wherever possible?

P.S.: I compile the C++ code (in MATLAB with MEX) with gcc 4.9, and my Linux system reports the following available instruction sets: Intel MMX, Intel SSE, Intel SSE2, Intel SSE3, Intel SSE4

Edit: I tried what @Avi Ginsburg suggested below and compiled it using the following command:

mex -g -largeArrayDims '-I/usr/include/eigen3' CXXFLAGS='-DEIGEN_DONT_VECTORIZE -std=c++11 -fPIC' test.cpp

with both double and long double, and each of these options gives me the same deviation from the MATLAB solution.

bonanza
  • gcc will use 80-bit extended precision for `long double` on x86, no need for a flag. If you want more precision and can sacrifice performance then you can use [gcc](http://stackoverflow.com/q/13516476/995714)'s [__float128](https://gcc.gnu.org/onlinedocs/gcc/Floating-Types.html) type – phuclv Sep 12 '16 at 08:12
  • 2
    `M_PI` does not have `long double` precision. With GNU extensions you can use [`M_PIl`](https://www.gnu.org/software/libc/manual/html_node/Mathematical-Constants.html#Mathematical-Constants), which has 128 bits of precision. That could be part of your problem. – mindriot Sep 12 '16 at 08:56
  • What type(s) are `p`, `x`, and `mu`? – Avi Ginsburg Sep 12 '16 at 09:23
  • @AviGinsburg, thanks for your comment! I just updated the question with the information that all these are vectors of kind `Eigen::Matrix< long double, Dynamic, 1 >`. – bonanza Sep 12 '16 at 10:02
  • Don't you need to suffix `l` or `L` for the literals? While they are probably upgraded to `long double` the precision may not be. – Avi Ginsburg Sep 12 '16 at 10:06
  • @AviGinsburg @mindriot I already tried it with `M_PIl`, as mindriot also suggested, but it didn't reduce the error with respect to the MATLAB solution. I guess it's because the separate line `p.fill( 1/( M_PIl * sigma * sigma ) );` is not that accuracy-critical. – bonanza Sep 12 '16 at 10:10
  • @bonanza I meant `0.5L` and `2.0L`. Also, change `pow(sigma, 2.0L)` to `sigma*sigma`. – Avi Ginsburg Sep 12 '16 at 10:12
  • 4
    Would be better if you provided source data (or how to generate it) and example codes for both Matlab and C++ version. Otherwise all the answers can only be guesswork. Also, what type does `sigma` have? How do you calculate/initialize it? It may just have too poor precision value at the very first step. – Ruslan Sep 12 '16 at 10:29
  • I agree with @Ruslan. It's time for a [mcve]. – mindriot Sep 12 '16 at 11:20
  • Please, also quantify the error you are observing. – ggael Sep 12 '16 at 14:10

3 Answers

2

I'm hazarding a guess here. You are using SSE instructions in your array calculations, most notably `...array().exp()`. SSE registers only hold 32- and 64-bit floats, so there is no extended (80-bit) precision on that path, hence the differences between MATLAB and Eigen.

Avi Ginsburg
  • Thanks! Is there a way (e.g. using GCC compiler flags) to force FPU usage for these (or all) operations? Will this help in getting extended (e.g. 80 bit) precision? – bonanza Sep 12 '16 at 08:20
  • 1
    No, Eigen was written with explicit vectorization (SSE) routines and doesn't rely on the compiler to recognize memory alignment and the vectorization opportunities. You can turn off Eigen's vectorization by defining `-DEIGEN_DONT_VECTORIZE`. – Avi Ginsburg Sep 12 '16 at 08:23
  • 1
    Just a comment, `sigma*sigma` will probably be faster than `pow( sigma, 2.0 )` (I know, insignificant compared to the rest of the line, but if sigma were an Eigen object, `sigma.square()` would be faster than `sigma.pow(2.0)` for the entire vector). – Avi Ginsburg Sep 12 '16 at 08:26
  • 1
    Does MATLAB use 80-bit precision? I doubt it very much. Most likely it uses native `double`, which is IEEE 754 `binary64` on x86. – Ruslan Sep 12 '16 at 10:33
  • I have no idea. Not my area of expertise. I assumed so based on the question but you have a very valid point. A quick search didn't yield any higher accuracy fp Matlab info. – Avi Ginsburg Sep 12 '16 at 10:42
  • @Ruslan I came up with [this](https://www.mathworks.com/matlabcentral/answers/94484-is-it-possible-to-use-extended-precision-for-decimal-number-using-64-bit-defined-by-ieee-754-standar). So natively, you are 100% correct, but arbitrary precision seems to be available as an add on. – Avi Ginsburg Sep 12 '16 at 10:50
1

By default Eigen uses a faster but slightly less accurate implementation of several mathematical functions, including the exponential. You can explicitly disable these optimizations by compiling with the `-DEIGEN_FAST_MATH=0` option.

If you use gcc as your compiler, also make sure that you don't use the `-Ofast` or `-ffast-math` options, as these can result in reduced precision.
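For the mex setup shown in the question's edit, the flag would be added like this (a sketch; include path and file name taken from the question):

```shell
mex -largeArrayDims '-I/usr/include/eigen3' \
    CXXFLAGS='-DEIGEN_FAST_MATH=0 -std=c++11 -fPIC' test.cpp
```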

Benoit Steiner
1

If you want to compute the probability density of a (1-dimensional) normal distribution, the factor at the beginning should be `1/std::sqrt( 2 * M_PIl * sigma * sigma )`.

Also, the `p.fill()` at the beginning of your snippet is inefficient. Just write this in one line:

p = (1/std::sqrt(2*M_PIl*sigma*sigma)) *
      ( -0.5/(sigma*sigma) * (mu-x).array().square() ).exp().matrix();

N.B.: If you are only performing element-wise operations on your arrays, consider declaring them as `Eigen::Array<...>` instead of `Eigen::Matrix<...>`. The template parameters are the same, and so is the binary layout, but you don't need to write `.array()` every time you want to perform element-wise operations.

chtz