1

I am a beginner in R. I've read that it is better to rewrite loops using the apply family of functions, but for the following problem I have no idea how to do that. Can anyone help, or recommend some similar examples?

    data(iris)  ## iris is a data frame
    n <- ncol(iris)
    for (i in 1:(n - 1)) {
        subSet <- iris[, c(i, n)]       ## extract the ith column and last column for analysis
        result <- someFunction(subSet)  ## analyze the subset
        score[i] <- result$score
        splitVal[i] <- result$splitVal
    }
Thell
yliueagle
    What you need to avoid is growing any vector or matrix inside a loop. This is easily done. The apply functions are better from the point of view of being more compact and (usually) easier to understand and debug, but may actually be a little slower than a well written loop. Real speedup comes from actual vectorization, where possible. – Glen_b Apr 22 '14 at 08:59
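
The advice about not growing vectors inside the loop could look roughly like the sketch below. It is only an illustration: someFunction is a hypothetical stand-in (the question never defines it) that returns the mean and median of the first column.

```r
data(iris)

# Hypothetical stand-in for the question's undefined someFunction
someFunction <- function(x) {
  list(score = mean(x[, 1]), splitVal = median(x[, 1]))
}

n <- ncol(iris)

# Allocate both result vectors once, at their final length,
# instead of letting them grow on every iteration
score    <- numeric(n - 1)
splitVal <- numeric(n - 1)

for (i in seq_len(n - 1)) {
  result      <- someFunction(iris[, c(i, n)])
  score[i]    <- result$score
  splitVal[i] <- result$splitVal
}
```

With the vectors preallocated, the loop only overwrites existing elements, so no reallocation happens as i increases.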

3 Answers

1

You can easily do this with sapply:

data(iris)
someFunction <- function(x) {
  list(score = mean(x[,1]),
       splitVal = median(x[,1]))
}
n <- ncol( iris )
sapply(1:(n-1), function(i, dataset, n){
  subSet <- dataset[, c(i, n)] ## extract the ith column and last column for analysis
  result <- someFunction( subSet ) ## analyze on the subset
  c(score = result$score, 
        splitVal = result$splitVal)
}, dataset = iris, n=n)

It will return the results:

             [,1]     [,2]  [,3]     [,4]
score    5.843333 3.057333 3.758 1.199333
splitVal 5.800000 3.000000 4.350 1.300000

It may, however, be worth doing the same with lapply, since that makes it easy to switch to parallel processing later by swapping in one of the lapply-like functions from the parallel package.
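
A minimal sketch of the lapply variant, reusing the toy someFunction from this answer. The names res_list and res are just illustrative choices.

```r
data(iris)

# Same toy someFunction as in the sapply example
someFunction <- function(x) {
  list(score = mean(x[, 1]), splitVal = median(x[, 1]))
}

n <- ncol(iris)

# lapply always returns a list, one element per analysed column
res_list <- lapply(seq_len(n - 1), function(i) {
  result <- someFunction(iris[, c(i, n)])
  c(score = result$score, splitVal = result$splitVal)
})
names(res_list) <- names(iris)[-n]

# Stack the per-column results into a matrix (one row per column of iris)
res <- do.call(rbind, res_list)
```

If the per-column work is expensive, the lapply call is the piece you would replace, for example with parallel::mclapply, which relies on forking and therefore does not run in parallel on Windows.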

Max Gordon
0

Try with

apply(iris[,-n], 2, someFunction, paramFUN=iris[,n])

Here someFunction needs to take the column iris[,i] as its main argument and the column iris[,n] as a second argument (which I call paramFUN), and n is ncol(iris) as in the question.

If that is not the case and the main argument must be a data.frame with two columns, you can use a workaround.

bindandfun <- function(x, y) {
  auxdf <- cbind(x, y)
  #
  res <- someFunction(auxdf)
  return(res)
}

apply(iris[,-n], 2, bindandfun, y=iris[,n])

Just be careful with the classes: auxdf will be a matrix, and cbind() reduces a factor column to its integer codes. If you need to, you can add some lines where I've marked the # to change the class of auxdf and to turn the second column back into a factor.
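
One way to handle the class issue is to build a data frame instead of a matrix. The sketch below assumes a hypothetical someFunction that requires a data.frame whose second column is a factor; the toy score it computes (the largest group mean) is invented purely for illustration.

```r
data(iris)
n <- ncol(iris)

# Hypothetical someFunction that insists on a data.frame with a factor in column 2
someFunction <- function(df) {
  stopifnot(is.data.frame(df), is.factor(df[[2]]))
  max(tapply(df[[1]], df[[2]], mean))  # toy score: largest group mean
}

bindandfun <- function(x, y) {
  # data.frame() keeps y as a factor, whereas cbind() would store its integer codes
  auxdf <- data.frame(x = x, y = y)
  someFunction(auxdf)
}

res <- apply(iris[, -n], 2, bindandfun, y = iris[, n])
```

Here res is a named numeric vector with one score per measurement column.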

Rufo
0

Long-time reader, first-time answerer!

Why not define a new function, newFn, that takes a single column of the dataset as input and picks up the last column inside the function? Then x would be data[,j], and

newFn <- function(x) {
  use <- cbind(x, data[, n])  # create your two-column matrix
  someFunction(use)           # now apply the function you created
}

then

sapply(data[,-n], newFn) 
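
A self-contained variant of the same idea, as a sketch. It passes the last column in as an argument (here called last, a name chosen for illustration) rather than reading data and n from the global environment, and it uses vapply so the shape of each result is checked. someFunction is again a toy stand-in.

```r
data(iris)
n <- ncol(iris)

# Toy stand-in for the question's someFunction
someFunction <- function(x) {
  list(score = mean(x[, 1]), splitVal = median(x[, 1]))
}

# 'last' is supplied by the caller instead of being looked up globally
newFn <- function(x, last) {
  use <- data.frame(x = x, last = last)  # two-column data frame
  result <- someFunction(use)
  c(score = result$score, splitVal = result$splitVal)
}

# numeric(2) declares that every call must return two numbers
res <- vapply(iris[, -n], newFn, numeric(2), last = iris[, n])
```

The result is a matrix with the rows score and splitVal and one column per measurement.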
meh