I want to compute a Mahalanobis distance for each pair of scores, grouped by another variable. In this case, that means a Mahalanobis distance for each Attribute, computed across the two score columns. The output should be 3 Mahalanobis distances (one each for A, B, and C).
This is what I currently have. My original data frame contains some NAs, so I include one in the reprex:
    library(tidyverse)
    library(purrr)

    df <- tibble(Attribute = unlist(map(LETTERS[1:3], rep, 5)),
                 Score1 = c(runif(7), NA, runif(7)),
                 Score2 = runif(15))

    mah_db <- df %>%
      dplyr::group_by(Attribute) %>%
      dplyr::summarise(MAH = mahalanobis(Score1:Score2,
                                         center = base::colMeans(Score1:Score2),
                                         cov(Score1:Score2, use = "pairwise.complete.obs")))
This raises the error:
    Caused by error in `base::colMeans()`:
    ! 'x' must be an array of at least two dimensions
But as far as I can tell, I am giving colMeans two columns.
So what is going wrong here? And would fixing this error even give a complete solution?
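One possible reading of the error, offered as a sketch rather than a confirmed fix: inside summarise(), `Score1:Score2` is evaluated by base R's `:` operator on the two column vectors, not as a tidyselect column range, so colMeans() receives a plain vector instead of a two-column matrix. The sketch below binds the two columns into a matrix with cbind() first. It assumes the goal is one distance per row within each Attribute, because mahalanobis() returns one value per row; reducing that to a single value per group would need a further choice that is not specified above. The names `m` and `row_dist`, the use of set.seed(), and the `na.rm = TRUE` centering are illustrative assumptions.

```r
library(dplyr)

set.seed(1)  # assumption: fixed seed so the example is reproducible

df <- data.frame(Attribute = rep(LETTERS[1:3], each = 5),
                 Score1 = c(runif(7), NA, runif(7)),
                 Score2 = runif(15))

row_dist <- df %>%
  group_by(Attribute) %>%
  mutate(MAH = {
    # Build an explicit two-column matrix for this group
    m <- cbind(Score1, Score2)
    # Squared Mahalanobis distance of each row from the group centre;
    # a row containing NA yields NA
    mahalanobis(m,
                center = colMeans(m, na.rm = TRUE),
                cov = cov(m, use = "pairwise.complete.obs"))
  }) %>%
  ungroup()
```

Note that mahalanobis() returns squared distances, and that `row_dist` has one row per observation (15 here), not one per Attribute.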