I have a dataframe in the following form:

   dim1  dim2
1 Loc.1 0.325
2 Loc.2 0.325
3 Loc.3 0.321
4 Loc.4 0.256
5 Loc.5 0.255

I would like to compute the mean of every pair of elements in 'dim2' and arrange the results in a matrix, while keeping the labels provided by 'dim1'.

For now, I can get pairwise means using the combn function:

combn(tab[,2],2, mean)
[1] 0.3250 0.3230 0.2905 0.2900 0.3230 0.2905 0.2900 0.2885 0.2880 0.2555

but I would like the result displayed in a matrix-like form, similar to an object of class 'dist' (which is what I need for further analyses), like this:

        Loc.1   Loc.2   Loc.3   Loc.4
Loc.2   0.325           
Loc.3   0.323   0.323       
Loc.4   0.290   0.291   0.289   
Loc.5   0.290   0.290   0.288   0.256

(As you can see, I also need to keep the 'Loc.x' labels.)

I could not find a simple function that performs pairwise calculations directly on my dataframe 'tab'. I could use a for loop, but I feel there should be a more straightforward way.

Any suggestions? Thank you very much!

Jaap
Chrys

3 Answers

Here is a relatively simple way to convert a vector to a distance matrix:

vec <- c(0.3250, 0.3230, 0.2905, 0.2900, 0.3230, 0.2905, 0.2900, 0.2885, 0.2880, 0.2555)

mat <- matrix(nrow = 5, ncol = 5)
mat[lower.tri(mat)] <- vec
mat <- as.dist(mat)

#output
> mat
       1      2      3      4
2 0.3250                     
3 0.3230 0.3230              
4 0.2905 0.2905 0.2885       
5 0.2900 0.2900 0.2880 0.2555
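
The same idea can be sketched without hard-coding the vector, and with the 'Loc.x' labels attached. This is only a sketch: the data frame `tab` is rebuilt here from the values shown in the question, and the names `labs`, `n` and `d` are introduced purely for illustration. It relies on combn() returning the pairs in the order (1,2), (1,3), ..., (4,5), which is the same column-by-column order in which lower.tri() indexes the matrix.

```r
# Sample data rebuilt from the question (assumed to match the asker's 'tab')
tab <- data.frame(
  dim1 = paste0("Loc.", 1:5),
  dim2 = c(0.325, 0.325, 0.321, 0.256, 0.255),
  stringsAsFactors = FALSE
)

n    <- nrow(tab)
labs <- as.character(tab$dim1)

# Empty square matrix whose row and column names are the location labels
mat <- matrix(NA_real_, nrow = n, ncol = n, dimnames = list(labs, labs))

# Fill the lower triangle with the pairwise means; combn() and lower.tri()
# both walk through the pairs column by column, so the values line up
mat[lower.tri(mat)] <- combn(tab$dim2, 2, mean)

# as.dist() keeps the lower triangle and takes its labels from the row names
d <- as.dist(mat)
d
```

Because the labels come from the row names of the matrix, the printed result should show Loc.1 to Loc.5 rather than 1 to 5.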
missuse

Here is a one-liner using expand.grid instead of combn.

as.dist(matrix(apply(expand.grid(tab[, 2], tab[, 2]), 1, mean), 5, 5))
#       1      2      3      4
#2 0.3250
#3 0.3230 0.3230
#4 0.2905 0.2905 0.2885
#5 0.2900 0.2900 0.2880 0.2555

This works because expand.grid generates all ordered pairs of the elements of tab[, 2], including each element paired with itself, whereas combn returns only the unordered pairs of distinct elements and so leaves out the diagonal. We then compute the mean of each row of the combination table, reshape the resulting vector into a 5 x 5 matrix, and convert it to a dist object, which keeps only the lower triangle.
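
A slightly longer variant of the one-liner, which may be easier to adapt, avoids the hard-coded 5 and keeps the 'Loc.x' labels. It is a sketch under the assumption that `tab` looks like the data frame in the question (it is rebuilt below); the names `pairs`, `m` and `d` are illustrative only, and rowMeans() is swapped in for apply(..., 1, mean).

```r
# Sample data rebuilt from the question (assumed to match the asker's 'tab')
tab <- data.frame(
  dim1 = paste0("Loc.", 1:5),
  dim2 = c(0.325, 0.325, 0.321, 0.256, 0.255),
  stringsAsFactors = FALSE
)

n <- nrow(tab)

# All n^2 ordered pairs of values, including each value paired with itself
pairs <- expand.grid(x = tab$dim2, y = tab$dim2)

# Mean of each pair, reshaped into an n x n matrix labelled by location.
# The matrix is symmetric, so the fill order does not affect the result.
m <- matrix(rowMeans(pairs), nrow = n, ncol = n,
            dimnames = list(tab$dim1, tab$dim1))

# as.dist() drops the diagonal and the upper triangle
d <- as.dist(m)
d
```

The full matrix `m` still holds the self-pairs on its diagonal, which is handy if those values are needed before the conversion to a dist object.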

Maurits Evers

You can also use the outer function. Naming the vector first carries the 'Loc.x' labels through to the result.

dim2 <- as.numeric(tab$dim2)
names(dim2) <- tab$dim1
x <- outer(dim2, dim2, function(x,y) (x + y) / 2)
as.dist(x)
#        Loc.1  Loc.2  Loc.3  Loc.4
# Loc.2 0.3250                     
# Loc.3 0.3230 0.3230              
# Loc.4 0.2905 0.2905 0.2885       
# Loc.5 0.2900 0.2900 0.2880 0.2555
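
Since the question mentions further analyses, here is a hedged sketch of how the labelled dist object might be inspected afterwards. The data frame is rebuilt from the question, `d` and `full` are illustrative names, and `outer(dim2, dim2, "+") / 2` is used as an equivalent way of writing the pairwise mean.

```r
# Sample data rebuilt from the question (assumed to match the asker's 'tab')
tab <- data.frame(
  dim1 = paste0("Loc.", 1:5),
  dim2 = c(0.325, 0.325, 0.321, 0.256, 0.255),
  stringsAsFactors = FALSE
)

# Named vector: outer() turns the names into row and column names
dim2 <- setNames(tab$dim2, tab$dim1)
d <- as.dist(outer(dim2, dim2, "+") / 2)

# The labels are stored as an attribute of the dist object
attr(d, "Labels")

# Convert back to a full symmetric matrix to look up a single pair by name
full <- as.matrix(d)
full["Loc.4", "Loc.2"]   # 0.2905
```

Note that as.matrix() on a dist object fills the diagonal with zeros, so the self-pair means are not recoverable from `d`; keep the matrix returned by outer() if they matter.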
hpesoj626