I've been looking at the help pages for tapply and by, and I'm not sure whether they are the right tool for this. For example, suppose I have a data frame with the columns Name, Value1, Value2, and I want to apply a function, say function f(x,y) { do_something }, to Value1 and Value2 grouped by Name, getting back a data frame with the columns Name, f(Value1,Value2). How should I go about that?

I can get tapply to work in a simple case like this:

tapply(df$Value1, df$Name, mean)

But what if my function also takes df$Value2 as input and is not as simple as mean? In other words, pseudo-notation for what I'm trying to do would be:

tapply(df$Name, c(df$value1,df$value2), function f(x,y) { x+y+bla...})

plannapus
Palace Chan
  • Maybe you can make your example more concrete? I would suggest looking into the data.table package here; otherwise I guess you'd need some combination of `by` with `mapply`. – Frank Sep 17 '13 at 21:30
  • Added something to make it more concrete – Palace Chan Sep 17 '13 at 21:32

2 Answers

by will do the job, although it returns an object of class "by" rather than a data.frame.

by(df, df$Name, function(X) f(X$Value1, X$Value2))
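As a minimal sketch of the by call above, the per-group results can be collected back into a data.frame. The toy data and the function f below are assumptions made up for illustration; only the column names come from the question.

```r
# Toy data; the column names follow the question, the values are made up.
df <- data.frame(
  Name   = c("a", "a", "b", "b", "b"),
  Value1 = c(1, 2, 3, 4, 5),
  Value2 = c(10, 20, 30, 40, 50),
  stringsAsFactors = FALSE
)

# Hypothetical f: any function of two vectors that returns a single value.
f <- function(x, y) sum(x * y)

# by() passes each Name group to the function as a data frame X.
res <- by(df, df$Name, function(X) f(X$Value1, X$Value2))

# The result has class "by"; flatten it into a two-column data.frame.
out <- data.frame(
  Name   = dimnames(res)[[1]],
  result = as.vector(res),
  stringsAsFactors = FALSE
)
out
```

This only works as written when f returns one value per group; a vector-valued f would need a different way of stacking the pieces.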

The package data.table is better set up for this sort of thing:

install.packages("data.table")  # only needed once
library(data.table)
dt <- as.data.table(df)
# evaluate f on both columns within each Name group
dt[, f(Value1, Value2), by = Name]

This returns one row per Name with the result of f, which is what you're looking for. The result is a data.table, which is also a data.frame.
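When the expression in j is unnamed, data.table calls the computed column V1. A small sketch of how the column could be given a name instead is below; the toy data and f are assumptions for illustration, not part of the question.

```r
library(data.table)

# Toy data; the column names follow the question, the values are made up.
df <- data.frame(
  Name   = c("a", "a", "b", "b", "b"),
  Value1 = c(1, 2, 3, 4, 5),
  Value2 = c(10, 20, 30, 40, 50),
  stringsAsFactors = FALSE
)

# Hypothetical f: any function of two vectors that returns a single value.
f <- function(x, y) sum(x * y)

dt <- as.data.table(df)

# .() is data.table's alias for list(); naming the element names the column.
out <- dt[, .(result = f(Value1, Value2)), by = Name]
out
```

If a plain data.frame is needed afterwards, as.data.frame(out) should convert it.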

Señor O

Also check out plyr. For example:

require(plyr)
ddply(mtcars, .variables="cyl", .fun=mutate,
      meaningless_number = mean(mpg) + disp)

will give you back a data frame just like mtcars, with the added column meaningless_number which is the mean mpg by cyl plus the individual disp. Use .fun = mutate to add columns, .fun = summarize to see summaries, and other functions for other purposes.
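To illustrate the .fun = summarize variant mentioned above, here is a short sketch on the same mtcars data. The output column names mean_mpg and max_disp are arbitrary choices for this example.

```r
library(plyr)

# summarize collapses each cyl group to one row holding the computed values.
out <- ddply(mtcars, .variables = "cyl", .fun = summarize,
             mean_mpg = mean(mpg),
             max_disp = max(disp))
out
```

For the data frame in the question, the same pattern would presumably look like ddply(df, "Name", summarize, result = f(Value1, Value2)), assuming f returns a single value per group.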

The answers to this question are very good for general *apply knowledge. I also found this answer to be a great plyr tutorial.

Gregor Thomas