
I have a list of words and their respective lengths. I used adist() to generate a matrix of similarities between the words. Now I want to divide the similarities by the lengths, so I need a matrix of "mean lengths" for each pair of words. How do I build it with apply()? In Excel I would put the list of lengths in the first column and paste it transposed into the first row. Each cell would then be the mean of the corresponding values in the first row and the first column. But I haven't found how to address the items like that inside apply(). Any thoughts, please? Thanks in advance!

A reproducible example:

lengths <- round(rnorm(10,20,10)) #suppose those are the words' lengths
  [1] 30 25 11  5 24 26 10 16 16  9

m <- matrix(ncol=11,nrow=11)
m[1,2:11] <- lengths
m[2:11,1] <- lengths
m
      [,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9] [,10] [,11]
 [1,]   NA   30   25   11    5   24   26   10   16    16     9
 [2,]   30   NA   NA   NA   NA   NA   NA   NA   NA    NA    NA
 [3,]   25   NA   NA   NA   NA   NA   NA   NA   NA    NA    NA
 [4,]   11   NA   NA   NA   NA   NA   NA   NA   NA    NA    NA
 [5,]    5   NA   NA   NA   NA   NA   NA   NA   NA    NA    NA
 [6,]   24   NA   NA   NA   NA   NA   NA   NA   NA    NA    NA
 [7,]   26   NA   NA   NA   NA   NA   NA   NA   NA    NA    NA
 [8,]   10   NA   NA   NA   NA   NA   NA   NA   NA    NA    NA
 [9,]   16   NA   NA   NA   NA   NA   NA   NA   NA    NA    NA
[10,]   16   NA   NA   NA   NA   NA   NA   NA   NA    NA    NA
[11,]    9   NA   NA   NA   NA   NA   NA   NA   NA    NA    NA

m[2,2] <- (m[1,2]+m[2,1])/2
m[2,3] <- (m[1,3]+m[2,1])/2
m[2,4] <- (m[1,4]+m[2,1])/2
m

      [,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9] [,10] [,11]
 [1,]   NA   30 25.0 11.0    5   24   26   10   16    16     9
 [2,]   30   30 27.5 20.5   NA   NA   NA   NA   NA    NA    NA
 [3,]   25   NA   NA   NA   NA   NA   NA   NA   NA    NA    NA
 [4,]   11   NA   NA   NA   NA   NA   NA   NA   NA    NA    NA
 [5,]    5   NA   NA   NA   NA   NA   NA   NA   NA    NA    NA
 [6,]   24   NA   NA   NA   NA   NA   NA   NA   NA    NA    NA
 [7,]   26   NA   NA   NA   NA   NA   NA   NA   NA    NA    NA
 [8,]   10   NA   NA   NA   NA   NA   NA   NA   NA    NA    NA
 [9,]   16   NA   NA   NA   NA   NA   NA   NA   NA    NA    NA
[10,]   16   NA   NA   NA   NA   NA   NA   NA   NA    NA    NA
[11,]    9   NA   NA   NA   NA   NA   NA   NA   NA    NA    NA

I need an apply function to do this for the whole matrix, but I don't know how to index it.

Rodrigo

2 Answers


You just need some matrix arithmetic. I am also going to drop the NAs from your matrix (the first column and first row). You don't need them, do you?

> v <- c(30, 25, 11, 5, 24, 26, 10, 16, 16, 9)
> m <- matrix(0, ncol=10, nrow=10)
> (t(m+v) + (m+v))/2
      [,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9] [,10]
 [1,] 30.0 27.5 20.5 17.5 27.0 28.0 20.0 23.0 23.0  19.5
 [2,] 27.5 25.0 18.0 15.0 24.5 25.5 17.5 20.5 20.5  17.0
 [3,] 20.5 18.0 11.0  8.0 17.5 18.5 10.5 13.5 13.5  10.0
 [4,] 17.5 15.0  8.0  5.0 14.5 15.5  7.5 10.5 10.5   7.0
 [5,] 27.0 24.5 17.5 14.5 24.0 25.0 17.0 20.0 20.0  16.5
 [6,] 28.0 25.5 18.5 15.5 25.0 26.0 18.0 21.0 21.0  17.5
 [7,] 20.0 17.5 10.5  7.5 17.0 18.0 10.0 13.0 13.0   9.5
 [8,] 23.0 20.5 13.5 10.5 20.0 21.0 13.0 16.0 16.0  12.5
 [9,] 23.0 20.5 13.5 10.5 20.0 21.0 13.0 16.0 16.0  12.5
[10,] 19.5 17.0 10.0  7.0 16.5 17.5  9.5 12.5 12.5   9.0
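
To make the recycling trick above easier to follow, here is a minimal sketch that builds the two intermediate matrices explicitly. The names `rows`, `cols`, and `mean_len` are illustrative, and the `outer()` comparison is a swapped-in alternative, not part of the original answer.

```r
v <- c(30, 25, 11, 5, 24, 26, 10, 16, 16, 9)
n <- length(v)

# matrix() fills column by column, so every column is a copy of v:
# rows[i, j] == v[i]
rows <- matrix(v, nrow = n, ncol = n)

# Transposing puts v along each row instead: cols[i, j] == v[j]
cols <- t(rows)

# Element-wise mean of the two: mean_len[i, j] == (v[i] + v[j]) / 2
mean_len <- (rows + cols) / 2

# outer() computes all pairwise sums directly and gives the same matrix
stopifnot(isTRUE(all.equal(mean_len, outer(v, v, "+") / 2)))
```

`m + v` in the answer produces the same matrix as `rows` here, because R recycles `v` down the columns of the zero matrix.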
CT Zhu
  • Very ingenious! It works perfectly. Thanks a lot! But the question remains on how to use apply() with two indexes... – Rodrigo Nov 19 '13 at 16:25
  • Sure, that can be done with `sapply(1:10, function(i) (v+v[i])/2)`, which gives you the same matrix. It is the same as `apply(matrix(1:10), 1, function(i) (v+v[i])/2)`, only it reads better. – CT Zhu Nov 19 '13 at 16:39
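
For the "two indexes" question specifically, one possible sketch is to hand both indices to the function at once. This assumes `v` holds the lengths as in the answer above; `idx`, `grid`, `by_index`, and `by_mapply` are illustrative names.

```r
v <- c(30, 25, 11, 5, 24, 26, 10, 16, 16, 9)
idx <- seq_along(v)

# outer() calls the function once with all (i, j) pairs as two
# equal-length vectors, so the function body must be vectorized
by_index <- outer(idx, idx, function(i, j) (v[i] + v[j]) / 2)

# The same idea with an explicit pair at a time: expand.grid() lists
# every (i, j) combination, with i varying fastest
grid <- expand.grid(i = idx, j = idx)
by_mapply <- matrix(
  mapply(function(i, j) (v[i] + v[j]) / 2, grid$i, grid$j),
  nrow = length(v)
)
```

Because `expand.grid()` varies `i` fastest and `matrix()` fills column by column, `by_mapply[i, j]` lines up with the pair `(i, j)`.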

If you want to use a function from the apply family:

lengths <- c(30, 25, 11, 5, 24, 26, 10, 16, 16, 9)
sapply(lengths, function(x) (x + lengths) / 2)

Output:

      [,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9] [,10]
 [1,] 30.0 27.5 20.5 17.5 27.0 28.0 20.0 23.0 23.0  19.5
 [2,] 27.5 25.0 18.0 15.0 24.5 25.5 17.5 20.5 20.5  17.0
 [3,] 20.5 18.0 11.0  8.0 17.5 18.5 10.5 13.5 13.5  10.0
 [4,] 17.5 15.0  8.0  5.0 14.5 15.5  7.5 10.5 10.5   7.0
 [5,] 27.0 24.5 17.5 14.5 24.0 25.0 17.0 20.0 20.0  16.5
 [6,] 28.0 25.5 18.5 15.5 25.0 26.0 18.0 21.0 21.0  17.5
 [7,] 20.0 17.5 10.5  7.5 17.0 18.0 10.0 13.0 13.0   9.5
 [8,] 23.0 20.5 13.5 10.5 20.0 21.0 13.0 16.0 16.0  12.5
 [9,] 23.0 20.5 13.5 10.5 20.0 21.0 13.0 16.0 16.0  12.5
[10,] 19.5 17.0 10.0  7.0 16.5 17.5  9.5 12.5 12.5   9.0
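
To connect this back to the original goal, here is a rough sketch of dividing an `adist()` matrix by the pairwise mean lengths. The `words` vector is a made-up example, and the lengths are taken with `nchar()` as an assumption about how they were obtained. Note that `adist()` returns edit distances, so smaller values mean more similar words.

```r
# Hypothetical word list, only for illustration
words <- c("kitten", "sitting", "mitten", "sit")

# Pairwise generalized Levenshtein distances
d <- utils::adist(words)

# Mean length for every pair of words: (len[i] + len[j]) / 2
len <- nchar(words)
mean_len <- outer(len, len, "+") / 2

# Element-wise division gives a length-normalized distance
normalized <- d / mean_len
dimnames(normalized) <- list(words, words)
```

Both matrices have the same dimensions and ordering, so plain `/` is enough and no apply call is needed for this step.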
Christian