0

I am trying to smooth a matrix by attributing the mean value of a window covering n columns around a given column. I've managed to do it but I'd like to see how would be 'the R way' of doing it as I am making use of for loops. Is there a way to get this using apply or some function of the same family?

Example:

# create a toy matrix
mat <- matrix(ncol=200);
for(i in 1:100){ mat <- rbind(mat,sample(1:200, 200) )}
# quick visualization
image(t(mat))

enter image description here

This is the matrix before smoothing:

I wrote the function smooth_mat that takes a matrix and the length of the smoothing kernel:

smooth_row_mat <- function(k, k.d=5){
    k.range <- (k.d + 2):(ncol(k) - k.d - 1)
    k.smooth <- matrix(nrow=nrow(k))
    for( i in k.range){
            if (i  %% 10 == 0)      cat('\r',round(i/length(k.range), 2))
            k.smooth <- cbind( k.smooth, rowMeans(k[,c(  (i-1-k.d):(i-1) ,i, (i+1):(i + 1 - k.d) )]) )
    }
    return(k.smooth)

}

Now we use smooth_row_mat() with mat

mat.smooth <- smooth_mat(mat)

And we have successfully smoothed, on a row basis, the content of the matrix.

This is the matrix after: enter image description here

This method is good for such a small matrix although my real matrices are around 40,000 x 400, still works but I'd like to improve my R skills.

Thanks!

pedrosaurio
  • 4,708
  • 11
  • 39
  • 53

2 Answers2

2

Here's how I'd do it, with the raster package.

First, create a matrix filled with random data and coerce it to a raster object.

library(raster)
r <- raster(matrix(sample(200, 200*200, replace=TRUE), nc=200))
plot(r)

enter image description here

Then use the focal function to calculate a neighbourhood mean for a neighbourhood of n cells either side of the focal cell. The values in the matrix of weights you provide to the focal function determine how much the value of each cell contributes to the focal summary. For a mean, we say we want each cell to contribute 1/n, so we fill a matrix of n columns, with values 1/n. Note that n must be an odd number, and the cell in the centre of the matrix is considered the focal cell.

n <- 3
smooth_r <- focal(r, matrix(1/n, nc=n))
plot(smooth_r)

enter image description here

jbaums
  • 27,115
  • 5
  • 79
  • 119
  • 1
    Thanks, this is very useful. I chose the other answer as correct as it doesn't make use of external libraries. – pedrosaurio Apr 28 '14 at 12:01
2

You can apply a filter (running mean) across each row of your matrix as follows:

apply(k, 1, filter, rep(1/k.d, k.d))
Gavin Kelly
  • 2,374
  • 1
  • 10
  • 13
  • Why would this work? It is just applying the same filter to the whole row and not based on the actual content of the row. Am I not getting something? – pedrosaurio Apr 28 '14 at 11:31
  • I understand now what filter is doing. I made the mistake of applying it to time series of the same number repeated many times, therefore the lack of differences. – pedrosaurio Apr 28 '14 at 11:39
  • 1
    It's a filter in the time-series sense of the word - `apply` will take the contents of the row, and effectively call `filter(contentsOfRow, rep(1/k.d, k.d))` and `filter` takes every window of contentsOfRow (of same length as it's second argument), and produces a weighted sum of the elements in that window - I've given it equal weights, so it will be a windowed mean. – Gavin Kelly Apr 28 '14 at 11:39
  • I understand now how `filter` is working although it gives weird results if used with apply. In detail, it is transposing the matrix. The filtered row gets transformed into column. Why does this happen? – pedrosaurio Apr 28 '14 at 15:23
  • 1
    From the help page: If each call to 'FUN' returns a vector of length 'n', then 'apply' returns an array of dimension 'c(n, dim(X)[MARGIN])' - so just wrap it in a `t()` if you want an apply-based result - or wrap the `filter` in a loop if you'd rather stick with a loop-based method. – Gavin Kelly Apr 28 '14 at 15:47
  • Ahh I was lost reading about `filter` and not `apply`. Thanks my friend. – pedrosaurio Apr 28 '14 at 16:03