Suppose I have a data frame with columns c1, ..., cn, and a function f that takes in the columns of this data frame as arguments. How can I apply f to each row of the data frame to get a new data frame?

For example,

x = data.frame(letter=c('a','b','c'), number=c(1,2,3))
# x is
# letter | number
#      a | 1
#      b | 2
#      c | 3

f = function(letter, number) { paste(letter, number, sep='') }

# desired output is
# a1
# b2
# c3

How do I do this? I'm guessing it's something along the lines of {s,l,t}apply(x, f), but I can't figure it out.

Brett

3 Answers

As @greg points out, paste() can do this. I suspect your example is a simplification of a more general problem. After struggling with this in the past, as illustrated in this previous question, I ended up using the plyr package for this type of thing. plyr does a LOT more, but for these things it's easy:

> require(plyr)
> adply(x, 1, function(x) f(x$letter, x$number))
  X1 V1
1  1 a1
2  2 b2
3  3 c3

You'll probably want to rename the output columns.

While I was typing this, @joshua showed an alternative method using ddply. The difference in my example is that adply treats the input data frame as an array, so it does not need the "group by" variable row that @joshua created. His approach is exactly how I was doing it until Hadley tipped me to the adply() approach in the aforementioned question.
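For comparison, the same row-by-row call can be sketched in base R without plyr. This is not the adply approach described above, only an alternative, and it assumes that the argument names of f match the column names of the data frame. The output column name `label` is a hypothetical choice made for illustration.

```r
x <- data.frame(letter = c("a", "b", "c"), number = c(1, 2, 3),
                stringsAsFactors = FALSE)
f <- function(letter, number) paste(letter, number, sep = "")

# mapply() calls f once per row, taking one element from each column.
# as.list(x) yields a named list of columns, so do.call() passes them
# to mapply() as letter = ..., number = ..., which f receives by name.
res <- do.call(mapply, c(list(FUN = f, USE.NAMES = FALSE), as.list(x)))

# Wrap the result in a data frame with an explicit column name
# ("label" is an illustrative name, not taken from the original post).
out <- data.frame(label = res, stringsAsFactors = FALSE)
print(out)
```

Naming the column when the data frame is built avoids having to rename a default column afterwards.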

JD Long
  • You could simplify this with `transform` or `summarize`: `adply(x, 1, summarize, paste(letter, number, sep = ""))` – JoFrhwld Aug 13 '10 at 23:12
  • Awesome, thanks! Yep, my example was just a toy example. I looked at plyr+reshape a while ago, and didn't understand it =(, but I'll definitely have to take a look again. – Brett Aug 15 '10 at 22:59
  • @JoFrhwld, you are exactly right about simplifying. The example Hadley gave me does exactly that. I didn't want to simplify too much, however, as I wanted a general answer that could be applied to other things. – JD Long Aug 16 '10 at 15:01
7
paste(x$letter, x$number, sep = "")
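The one-liner works because paste() is vectorized: it combines whole columns element by element, so no explicit loop over rows is needed. The sketch below assumes the f from the question, whose body uses only vectorized operations. A function that is not vectorized would need a row-wise approach instead.

```r
x <- data.frame(letter = c("a", "b", "c"), number = c(1, 2, 3),
                stringsAsFactors = FALSE)
f <- function(letter, number) paste(letter, number, sep = "")

# Because f only uses vectorized operations, it can take whole columns.
direct <- f(x$letter, x$number)

# do.call() passes each column of x to f as a named argument, which
# avoids spelling out the columns (assumes the names of x match f's arguments).
via_do_call <- do.call(f, as.list(x))

print(direct)
```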
Greg
  • This is the way I would have done it! It's almost like we've been mentored by the same R master ;-) – Vince Aug 14 '10 at 01:59
1

I think you were thinking of something like this, but note that the apply family of functions does not return data.frames, and apply() itself coerces your data.frame to a matrix before applying the function.

apply(x,1,function(x) paste(x,collapse=""))
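The matrix coercion matters in practice. Here is a small sketch, using a hypothetical data frame `y` with numbers of different widths, of how the row-wise apply() result can differ from pasting the columns directly:

```r
y <- data.frame(letter = c("a", "b"), number = c(1, 10),
                stringsAsFactors = FALSE)

# apply() calls as.matrix() on a data frame; with mixed column types
# the result is a character matrix, so the numbers become strings.
m <- as.matrix(y)
print(is.character(m))

# The numeric column is passed through format() during coercion, which
# can pad values to a common width before the function ever sees them.
by_row <- unname(apply(y, 1, function(r) paste(r, collapse = "")))

# Pasting the original columns keeps each number's own representation.
by_col <- paste(y$letter, y$number, sep = "")

print(by_row)
print(by_col)
```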

So you may be more interested in ddply from the plyr package.

> x$row <- 1:NROW(x)
> ddply(x, "row", function(df) paste(df[[1]],df[[2]],sep=""))
  row V1
1   1 a1
2   2 b2
3   3 c3
Joshua Ulrich