Suppose X is a matrix or data.frame. Is there something like apply(X, 1, FUN) that, instead of applying FUN to each row of X, takes the form myApply(X, INDEX, FUN), where INDEX is a factor or indexing vector? For each unique value IDX in INDEX, FUN should be applied to X[INDEX == IDX, ] and the results returned.

Thanks

AKG
  • Please make it easier to help you by providing a [**minimal, reproducible example**](http://stackoverflow.com/questions/5963269/how-to-make-a-great-r-reproducible-example/5963610#5963610) with some toy data, the desired output and the code you have tried. Thanks! – Henrik Apr 10 '14 at 14:09
  • I guess you're looking for `?aggregate` and/or `?ave`? – alexis_laz Apr 10 '14 at 14:19

1 Answer

You can do this by splitting your data frame with `split` and operating on each subset with `lapply`. For example, suppose you want the maximum ratio of sepal length to petal length for each species in the iris dataset:

data(iris)
unlist(lapply(split(iris, iris$Species),
              function(x) max(x$Sepal.Length / x$Petal.Length)))
#     setosa versicolor  virginica 
#   4.833333   1.700000   1.352941 

To return multiple values for each subset, have the function return a one-row data frame and bind the rows together with `do.call(rbind, ...)`:

do.call(rbind, lapply(split(iris, iris$Species),
                      function(x) data.frame(max.ratio = max(x$Sepal.Length / x$Petal.Length),
                                             min.ratio = min(x$Sepal.Length / x$Petal.Length))))
#            max.ratio min.ratio
# setosa      4.833333  2.526316
# versicolor  1.700000  1.176471
# virginica   1.352941  1.050000

This approach is called split-apply-combine: split the data into groups, apply a function to each group, and combine the results.
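To get closer to the myApply(X, INDEX, FUN) form asked about in the question, the same idea can be wrapped in a small helper. This is only a sketch: `myApply` is the hypothetical name from the question, not an existing base R function, and it assumes INDEX has one entry per row of X. It splits the row numbers instead of X itself so that it also works on matrices.

```r
# Hypothetical helper (name taken from the question, not part of base R):
# apply FUN to the rows of X selected by each unique value of INDEX.
myApply <- function(X, INDEX, FUN, ...) {
  # Assumption: INDEX has exactly one entry per row of X.
  stopifnot(length(INDEX) == nrow(X))
  # Split the row numbers rather than X, so matrices work as well as data frames.
  groups <- split(seq_len(nrow(X)), INDEX)
  # drop = FALSE keeps a single-row subset as a matrix or data frame.
  lapply(groups, function(rows) FUN(X[rows, , drop = FALSE], ...))
}

# Data frame input: one value per species, returned as a named list.
res <- myApply(iris, iris$Species,
               function(x) max(x$Sepal.Length / x$Petal.Length))
unlist(res)

# Matrix input: several values per group, combined into one row per species.
m <- as.matrix(iris[, 1:4])
means <- do.call(rbind, myApply(m, iris$Species, colMeans))
means
```

The helper always returns a list, leaving the combine step (`unlist`, `do.call(rbind, ...)`, and so on) to the caller, since the right way to combine depends on what FUN returns.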

josliber