I have a data frame with a grouping column, and I would like to calculate the Mahalanobis distance of each row within its own group.
I am trying to apply the Mahalanobis function to hundreds of groups and one particular group is causing an issue due to the small sample size (only two rows).
My data looks as follows:
foo <- data.frame(GRP = c("a", "a", "b", "b", "b"),
                  X = c(1, 1, 15, 12, 50),
                  Y = c(2.17, 12.44, 50, 70, 100))
I have borrowed a function idea from here and it looks as follows:
auto.mahalanobis <- function(temp) {
  mahalanobis(temp,
              center = colMeans(temp, na.rm = TRUE),
              cov = cov(temp, use = "pairwise.complete.obs"),
              tol = 1e-20,
              inverted = FALSE
  )
}
Based on a suggestion here, I added the tol argument to the auto.mahalanobis function (it is passed on to solve()) to avoid issues when inverting a covariance matrix that contains very small numbers.
I then tried using this function with my data set and am getting the following error about singular matrices:
library(dplyr)
z <- foo %>% group_by(GRP) %>% mutate(mahal = auto.mahalanobis(data.frame(X, Y)))
Error: Problem with `mutate()` input `mahal`.
x Lapack routine dgesv: system is exactly singular: U[1,1] = 0
i Input `mahal` is `auto.mahalanobis(data.frame(X, Y))`.
i The error occurred in group 1: GRP = "a".
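To see where the error comes from, the covariance matrix of group "a" can be inspected on its own. This is only a small diagnostic sketch using the example data above; the names a and S are introduced here for illustration.

```r
# Group "a" from the example data above
a <- data.frame(X = c(1, 1), Y = c(2.17, 12.44))

# X is constant within this group, so its variance and its covariance
# with Y are both zero
S <- cov(a)
S

# A zero determinant means solve() cannot invert the matrix; tol does not
# help here because the matrix is exactly singular
det(S)
```

More generally, a covariance matrix estimated from n rows has rank at most n - 1, so with two variables a group of only two rows can never give an invertible matrix.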
The same function works well for other groups that have larger sample sizes. Is there a suggested way to fix this issue, or to skip such groups when the sample is too small?
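For reference, this is the kind of guard I have in mind: a rough sketch, not a tested solution. The wrapper name safe.mahalanobis is hypothetical, and returning NA for groups with too few rows or a rank-deficient covariance matrix is just one possible choice. The grouping is done in base R here so the snippet runs without extra packages.

```r
# Hypothetical wrapper (name made up for this sketch): returns NA for
# groups whose covariance matrix cannot be inverted
safe.mahalanobis <- function(temp) {
  temp <- as.matrix(temp)
  S <- cov(temp, use = "pairwise.complete.obs")
  # Skip the group if it has too few rows, if the covariance contains NA,
  # or if the covariance matrix is not of full rank
  if (nrow(temp) <= ncol(temp) || anyNA(S) || qr(S)$rank < ncol(S)) {
    return(rep(NA_real_, nrow(temp)))
  }
  mahalanobis(temp, center = colMeans(temp, na.rm = TRUE), cov = S)
}

foo <- data.frame(GRP = c("a", "a", "b", "b", "b"),
                  X = c(1, 1, 15, 12, 50),
                  Y = c(2.17, 12.44, 50, 70, 100))

# Apply the wrapper per group and put the results back in row order
foo$mahal <- unsplit(lapply(split(foo[c("X", "Y")], foo$GRP), safe.mahalanobis),
                     foo$GRP)
foo
```

The same wrapper could presumably replace auto.mahalanobis inside the mutate() call, so that group "a" gets NA instead of stopping the whole pipeline.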