
I am very new to programming and R. I have tried to find an answer to my question, but part of the problem is that I don't know exactly what to search for.

I am trying to repeat a calculation (statistical distance) for each row of a matrix. Here is what I have so far:

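pollution1 <- as.matrix(pollution[, 5:6])   # columns 5 and 6 as a matrix
ss <- var(pollution1)                       # sample covariance matrix
ssinv <- solve(ss)                          # inverse of the covariance matrix
xbar <- colMeans(pollution1)                # vector of column means
# statistical distance for row 1 only
t(pollution1[1, ] - xbar) %*% ssinv %*% (pollution1[1, ] - xbar)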

This gets me only the first statistical distance, but I don't want to retype this line with a different matrix row to get all of them.

From what I have read, I may need a loop or apply(), but I haven't had success on my own. Any help with this, and advice on how to search for help so I don't need to post, would be appreciated. Thank you.

Ricardo Saporta
  • Welcome to SO! Can you provide a reproducible example so that we can help you? http://stackoverflow.com/questions/5963269/how-to-make-a-great-r-reproducible-example – agstudy Dec 01 '12 at 20:49

2 Answers


You might also consider the mahalanobis() function. From ?mahalanobis:

Returns the squared Mahalanobis distance of all rows in ‘x’ and the vector mu = ‘center’ with respect to Sigma = ‘cov’. This is (for vector ‘x’) defined as

                  D^2 = (x - mu)' Sigma^-1 (x - mu)

Of course, it's good to learn how to use apply too ...
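As a rough sketch of how that could look for the question's setup: the real pollution data are not shown, so the matrix below is simulated and only stands in for pollution[, 5:6]. The names d2 and d2_first are made up for this illustration.

```r
# Simulated stand-in for pollution[, 5:6]; the real data are not shown in the question
set.seed(42)
pollution1 <- matrix(rnorm(100), ncol = 2)

xbar <- colMeans(pollution1)   # column means, used as the centre
ss   <- var(pollution1)        # sample covariance matrix

# Squared Mahalanobis distance of every row from xbar, in one call
d2 <- mahalanobis(pollution1, center = xbar, cov = ss)

# Manual calculation for the first row, as in the question;
# drop() turns the 1x1 matrix result into a plain number
d2_first <- drop(t(pollution1[1, ] - xbar) %*% solve(ss) %*% (pollution1[1, ] - xbar))

# The two approaches should agree for row 1
all.equal(d2[1], d2_first)
```

Note that mahalanobis() takes the covariance matrix itself by default; if you have already computed the inverse (ssinv in the question), you can pass that instead together with inverted = TRUE.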

Ben Bolker

What about just using apply?

apply(pollution1, 1, function(i) t(i-xbar) %*% ssinv %*% (i-xbar))

Also, it's helpful if you make your example reproducible, for example:

set.seed(1)  # fix the random seed so everyone gets the same numbers
pollution1 = matrix(rnorm(100), ncol=2)
ss = var(pollution1)
ssinv = solve(ss)
xbar = colMeans(pollution1)
t(pollution1[1,]-xbar) %*% ssinv %*% (pollution1[1,]-xbar)
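
Putting the two pieces together, here is a minimal sketch that runs the apply() call on simulated data and checks it against an explicit loop, since the question mentions both options. The seed and the names d2_apply, d2_loop, centred and r are only illustrative choices, not part of the original code.

```r
# Simulated data, as in the reproducible example
set.seed(1)
pollution1 <- matrix(rnorm(100), ncol = 2)
ss    <- var(pollution1)
ssinv <- solve(ss)
xbar  <- colMeans(pollution1)

# One squared distance per row; MARGIN = 1 means "apply over rows",
# and drop() turns each 1x1 matrix result into a plain number
d2_apply <- apply(pollution1, 1, function(i) drop(t(i - xbar) %*% ssinv %*% (i - xbar)))

# The same calculation written as an explicit loop
d2_loop <- numeric(nrow(pollution1))
for (r in seq_len(nrow(pollution1))) {
  centred <- pollution1[r, ] - xbar
  d2_loop[r] <- drop(t(centred) %*% ssinv %*% centred)
}

# Both versions should give the same vector of distances
all.equal(d2_apply, d2_loop)
```

The loop preallocates its result with numeric() rather than growing a vector, which is the usual habit in R; apply() handles that bookkeeping for you.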
csgillespie