I am confused about which function from the apply family to use here. I have a data frame mydf:

terms
A
B
C

I want to apply a custom function to each of the values and get the results in new columns, like below:

terms Value1 Value2 ResultChar
A     23     45     Good
B     12     34     Average
C     9      23     Poor

The custom function works something like this: myfunc("A") returns a vector like (23, 45, Good).

Any help will be appreciated.

Cyrus Mohammadian
Tarak
  • How did you get 'Value1' and 'Value2' in the expected output when your input dataset has only a single column, i.e. `terms`? – akrun Aug 18 '16 at 05:57
  • It's a complex function with many values pulled from other data sources. All I'm passing is the character A. You may take a simpler example with a simple formula. – Tarak Aug 18 '16 at 06:00
  • You can loop with `lapply`, i.e. `lapply(mydf$terms, function(x) ...)`. Without more info, it is difficult to suggest anything further. – akrun Aug 18 '16 at 06:01
  • @akrun So which question should this be a duplicate of? http://stackoverflow.com/posts/39010839/timeline doesn't reveal the old dupe link any longer. Anyway, this might be a good time to flag for mod attention. – tripleee Aug 18 '16 at 07:58
  • @tripleee: [post history](https://stackoverflow.com/posts/39010839/revisions) does. – Martijn Pieters Aug 18 '16 at 07:59
  • @tripleee This question is not very clear to begin with. Secondly, it is more of a general question. So, I thought the link in my solution post fits well. – akrun Aug 18 '16 at 07:59
  • This question is clearly not asking why `rbindlist()` is better than `rbind()`, which is what the linked duplicate was showing. Therefore, I reopened it. – Rich Scriven Aug 18 '16 at 08:02
  • @DirtySockSniffer I would say the question was not at all clear. If the OP wants to choose which `apply` function to use, perhaps [this](http://stackoverflow.com/questions/3505701/r-grouping-functions-sapply-vs-lapply-vs-apply-vs-tapply-vs-by-vs-aggrega) might be the one. – akrun Aug 18 '16 at 08:02

2 Answers

4

It looks like you want a data frame as output, since the columns have different data types. So you need to define myfunc to return a data frame.

Consider this toy example:

mydf <- data.frame(terms = letters[1:3], stringsAsFactors = FALSE)
myfunc <- function (u) data.frame(terms = u, one = u, two = paste0(u,u))

Here is one possibility using base R only:

do.call(rbind, lapply(mydf$terms, myfunc))
#  terms one two
#1     a   a  aa
#2     b   b  bb
#3     c   c  cc
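
The same pattern can be sketched for the shape described in the question. Everything named here is a hypothetical stand-in: `myfunc_mixed`, `mydf_q`, and the hard-coded lookup values (copied from the question's expected output) only illustrate a function that returns two numbers and a label. Note that a plain vector such as `c(23, 45, "Good")` would coerce everything to character, which is why the function returns a one-row data frame.

```r
# Hypothetical stand-in for the asker's function; the real one pulls its
# values from other data sources.
myfunc_mixed <- function(term) {
  lookup <- list(
    A = list(23, 45, "Good"),
    B = list(12, 34, "Average"),
    C = list(9, 23, "Poor")
  )
  vals <- lookup[[term]]
  # A one-row data frame keeps numeric and character columns in their own types
  data.frame(terms = term,
             Value1 = vals[[1]],
             Value2 = vals[[2]],
             ResultChar = vals[[3]],
             stringsAsFactors = FALSE)
}

mydf_q <- data.frame(terms = c("A", "B", "C"), stringsAsFactors = FALSE)

# Apply the function to each term, then stack the one-row results
result <- do.call(rbind, lapply(mydf_q$terms, myfunc_mixed))
str(result)
```

`str(result)` is a quick way to confirm that `Value1` and `Value2` stayed numeric while `ResultChar` is character.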

Or you can use adply from the plyr package:

library(plyr)
adply(mydf, 1, myfunc)
#  terms terms.1 two
#1     a       a  aa
#2     b       b  bb
#3     c       c  cc

(>_<) This is my first time trying something other than base R for a data frame, and I am not sure why adply returns the undesired column name terms.1 here...
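
A possible explanation, offered as an assumption rather than something confirmed in this thread: with margin 1, `adply` hands each row to the function as a one-row data frame rather than as a character string, and `data.frame()` keeps a one-column data frame's own column name instead of the argument name. The base-R sketch below reproduces the naming without plyr; `row` is a hypothetical stand-in for what `adply` would pass in.

```r
# Same toy function as in the answer above
myfunc <- function (u) data.frame(terms = u, one = u, two = paste0(u, u))

# Stand-in for the one-row data frame that a row-wise split passes to myfunc
row <- data.frame(terms = "a", stringsAsFactors = FALSE)

# Passing the data frame: both 'terms = u' and 'one = u' contribute a column
# called "terms", which data.frame() then makes unique
names(myfunc(row))        # "terms" "terms.1" "two"

# Passing the bare value: the argument names are used as column names
names(myfunc(row$terms))  # "terms" "one" "two"
```

If this is indeed the cause, wrapping the call so that the function receives the value instead of the row, for example `adply(mydf, 1, function(row) myfunc(row$terms))`, should avoid the duplicated name.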

Zheyuan Li
3

We can use rbindlist from the data.table package with lapply. It would be more efficient than do.call(rbind, ...).

 library(data.table)
 rbindlist(lapply(mydf$terms, myfunc))

If needed, I can show the benchmarks, but they are already shown here

akrun