
I want to return multiple results from a function applied to columns of a data.frame and add them as new columns to the same data.frame, together with other simple calculations.

As a simplified example, suppose I want both the integral value and the absolute error of the sin function, together with the midpoints of the integration intervals:

library(data.table)
df <- data.frame(Lower = c(1,2,3), Upper = c(2,3,4))
setDT(df)
getIntegral <- function(l, u) {
  n <- integrate(sin, mean(l), mean(u))
  list(Value=n$value, Error=n$abs.error)
}
df[,
   c('Value', 'Error', 'Mid') := {
     n <- getIntegral(Lower, Upper)
     list(n$Value,
          n$Error,
          (Lower+Upper)/2)
   }]
df
   Lower Upper     Value        Error Mid
1:     1     2 0.5738457 6.370967e-15 1.5
2:     2     3 0.5738457 6.370967e-15 2.5
3:     3     4 0.5738457 6.370967e-15 3.5

I don't quite like my approach because separating the names of the new columns from the values assigned to them makes it hard to read. How can I do this better? It's part of a long data processing chain, so I don't want to create a temporary variable outside it, and I would prefer solutions using data.table or dplyr alone.

user3684014
  • Are you saying you don't like the data.table syntax?? – jlhoward Dec 29 '14 at 20:23
  • Do you mean something like this?? `setDT(df)[,":="(Value=getIntegral(Lower,Upper)$Value, Error=getIntegral(Lower,Upper)$Error, Mid =(Lower+Upper)/2)]` – jlhoward Dec 29 '14 at 20:32
  • Or perhaps this?? `setDT(df)[,c("Value","Error","Mid"):= with(getIntegral(Lower,Upper),list(Value,Error,(Lower+Upper)/2))]` – jlhoward Dec 29 '14 at 20:33
  • @jlhoward, I would like something like `setDT(df)[,":="(Value=getIntegral(Lower,Upper)$Value, Error=getIntegral(Lower,Upper)$Error, Mid =(Lower+Upper)/2)]`, but I don't want to run `getIntegral` twice. – user3684014 Dec 29 '14 at 20:44
  • Then do it the first way. Alternatively, you could change your function to return a named list with all three values (Value, Error, and Mid), and then just use `setDT(df)[,getIntegral(Lower,Upper),by=list(Lower,Upper)]` – jlhoward Dec 29 '14 at 20:46
  • @jlhoward, this is just a simplified example, in my real use case there are many combinations of such calculations, it's too much work to create new functions for every new combination of desired outputs. – user3684014 Dec 29 '14 at 21:02

1 Answer


The RHS should be a list of values, and each element of the list gets converted to a column (and recycled if necessary).
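
As a minimal sketch of that rule, assuming the data.table package is installed (the table `dt` and the columns `x`, `a`, and `b` are hypothetical names used only for illustration), a length-1 list element is recycled to fill every row while a full-length element is used as is:

```r
library(data.table)

# Hypothetical three-row table, for illustration only
dt <- data.table(x = 1:3)

# RHS is a list with two elements: a length-1 value and a length-3 vector.
# Each element becomes one column; the length-1 value is recycled to 3 rows.
dt[, c("a", "b") := list(10, x * 2)]

print(dt)
# Column a holds 10 in every row; column b holds 2, 4, 6.
```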

Your function already returns a list (each element of length 1), while (Lower+Upper)/2 returns a vector of 3 values (here). To combine them into a single list for the RHS, wrap the vector in list() and concatenate with c():

df[, c('Value', 'Error', 'Mid') := c(getIntegral(Lower, Upper), list((Lower+Upper)/2))]
#    Lower Upper     Value        Error Mid
# 1:     1     2 0.5738457 6.370967e-15 1.5
# 2:     2     3 0.5738457 6.370967e-15 2.5
# 3:     3     4 0.5738457 6.370967e-15 3.5

This makes use of the fact that c(list, list) results in a concatenated list.
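
The concatenation behaviour can be checked in base R on its own. This is a small sketch with made-up stand-in values (`res` mimics what getIntegral() returns, and `mid` mimics the midpoint vector; both names are hypothetical). It also shows why the vector has to be wrapped in list() first:

```r
# Stand-in for the list returned by getIntegral(): two length-1 elements
res <- list(Value = 0.57, Error = 6.4e-15)

# Stand-in for (Lower + Upper) / 2
mid <- c(1.5, 2.5, 3.5)

# c(list, list) concatenates into one list with three elements
combined <- c(res, list(Mid = mid))
str(combined)
length(combined)    # 3

# Without list(), each number in mid becomes its own list element
unwrapped <- c(res, mid)
length(unwrapped)   # 5
```

With the wrapped form, the resulting list has exactly one element per target column, which is what `:=` expects on the RHS.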

Arun
  • As @jlhoward suggested in the question comments, is it possible to have something like `setDT(df)[,":="(Value=getIntegral(Lower,Upper)$Value, Error=getIntegral(Lower,Upper)$Error, Mid =(Lower+Upper)/2)]`, without running `getIntegral` twice? I would like the column names and the values assigned to them to be closer together because I have a long list of calculations. – user3684014 Dec 30 '14 at 01:08
  • No, I don't think that's possible. – Arun Dec 30 '14 at 06:25