
I am working on finding outliers using Mahalanobis distance in R. I have a dataset with 30 rows and 24 columns, which I feed into the `mahalanobis` function from the stats package. I want to find the distance of each row vector from the rest of the rows. The results look good until I export the same input data and the same code to another machine and rerun it, which gives different results from those seen on machine 1. Is this expected behaviour, or am I missing something? Please advise.

Code I used:

```r
m_dist <- mahalanobis(data[, 2:25], colMeans(data[, 2:25]), cov(data[, 2:25]), tol = 1e-20)
```

Then I used a boxplot on m_dist to identify the outliers. The result on the first machine doesn't match the one on the second. I even used set.seed(1007) on both machines just to check, but the results are still different.
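
For reference, here is a minimal sketch of the full workflow, with simulated data standing in for my real dataset (the column layout and the use of `boxplot.stats()` to pull out the points beyond the whiskers are assumptions for illustration only):

```r
## Minimal sketch with simulated data in place of the real dataset
## (assumes column 1 is an ID column and columns 2:25 hold the 24 numeric variables)
set.seed(1007)
data <- as.data.frame(matrix(rnorm(30 * 25), nrow = 30))

x      <- data[, 2:25]
m_dist <- mahalanobis(x, colMeans(x), cov(x), tol = 1e-20)

## Treat the points that boxplot() would draw beyond the whiskers as outliers
out_values   <- boxplot.stats(m_dist)$out
outlier_rows <- which(m_dist %in% out_values)
outlier_rows
```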

I found another thread that discusses a similar result difference in Python, but it doesn't help me in any way...

    `mahalanobis` uses `solve` internally. Different systems (including different results from `La_library()`) can give different results. – Roland Oct 22 '18 at 08:33
  • Also, I would not set the tolerance lower than the default without looking into the underlying maths in more detail. – Roland Oct 22 '18 at 08:36
  • OK... is there any resolution for this? Is there a way I can get the same result? – Varun kadekar Oct 22 '18 at 09:07
  • I suspect you can get the same result if you increase the tolerance and ensure that the same LAPACK implementation is used. – Roland Oct 22 '18 at 09:20
  • I can confirm your findings. For me, Microsoft R Open, which uses Intel MKL as its linear algebra foundation, produced slightly different results compared with stock R from the R project. – Severin Pappadeux Oct 22 '18 at 15:23
  • OK, thanks. Then wouldn't it be unsafe to productionize this approach? It's risky to put code in production that can give different results on different machines. What's your opinion? – Varun kadekar Oct 23 '18 at 08:39
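
Following up on Roland's suggestions in the comments, a sketch of the checks one could run on both machines. The idea of comparing the linear-algebra backends and retrying with the default tolerance comes from the comments; the exact calls shown are an assumption and depend on the R version in use:

```r
## Compare the linear algebra backend on both machines; if the BLAS/LAPACK
## libraries differ, small numerical differences from solve() are expected.
sessionInfo()   # recent R versions report the BLAS/LAPACK paths here
La_library()    # path of the LAPACK library in use (newer R versions)
La_version()    # LAPACK version string

## Retry with solve()'s default tolerance; if the covariance matrix is
## nearly singular, this should error instead of silently "inverting" it.
x <- data[, 2:25]
m_dist_default <- mahalanobis(x, colMeans(x), cov(x))
```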

0 Answers