I would like to use = in j to create multiple variables from a given function

As an example, from the data.table

DT <- data.table(id=sample(10,100,replace=TRUE),v=runif(100))

I would like to create

# id  p20  p80
#  1 0.12 0.92
#  2 0.02 0.83

For now, I do

DT[,list(p20=quantile(v,0.20),p80=quantile(v,0.80)),by=id]
eddi
Matthew

2 Answers

1

I am not aware of a way to do this directly. As an alternative, you can wrap it in a function that builds the call as a string (this is quick and dirty code to give you an idea):

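DT <- data.table(id=sample(10,100,replace=TRUE),v=runif(100))

test <- function(pctls) {
  # start the j expression with the mean, then append one quantile per percentile
  str <- "mean=mean(v)"
  for(i in pctls) {
    str <- paste(str,",p",i,"=quantile(v,",i/100,")",sep="")
  }
  # wrap the pieces in a full data.table call and evaluate it
  str <- paste("DT[,list(",str,"), by=id]",sep="")
  eval(parse(text=str))
}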


test(c(10,20,30))
    id      mean        p10        p20        p30
 1:  1 0.3654006 0.04174424 0.05564887 0.13246705
 2:  2 0.3593194 0.07331625 0.09034995 0.09058092   
 3:  3 0.5071105 0.23652298 0.38699917 0.46832168
 4:  4 0.4399384 0.01624399 0.21743962 0.30668150
 5:  5 0.7163586 0.42516997 0.55865925 0.61741287
 6:  6 0.4773865 0.21349738 0.29869525 0.35726233
 7:  7 0.4433606 0.06423671 0.09839058 0.24951293
 8:  8 0.5774145 0.09875137 0.17900887 0.44749030
 9:  9 0.3980907 0.08683772 0.10629176 0.13377076
10: 10 0.5075917 0.18238568 0.28410222 0.39008093
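
The same table can probably be built without pasting strings and calling eval(parse()). The sketch below is one possible variant, not the answer's original code: it assumes the data.table package is installed, and the helper name quantiles_by_id and the fixed seed are hypothetical choices made for illustration. It calls quantile once per group with all requested probabilities and names the resulting list elements with setNames.

```r
library(data.table)

set.seed(42)  # assumption: fixed seed so the example is repeatable
DT <- data.table(id = sample(10, 100, replace = TRUE), v = runif(100))

# Hypothetical helper: per-id mean of v plus the requested percentiles.
# pctls is given in percent (e.g. 20 for the 20th percentile).
quantiles_by_id <- function(dt, pctls) {
  dt[, c(list(mean = mean(v)),
         setNames(as.list(quantile(v, pctls / 100)), paste0("p", pctls))),
     by = id]
}

res <- quantiles_by_id(DT, c(10, 20, 30))
print(res)
```

Because j returns a named list, each element becomes a column, so the result has the columns id, mean, p10, p20 and p30, with one row per id.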
6th
  • Thanks! However, one of the reasons I'm looking for a syntax like this is that calling `quantile(v,c(0.2,0.8))` once must be more efficient than calling `quantile(v,0.2)` and then `quantile(v,0.8)`, no? – Matthew Sep 12 '14 at 19:07
  • You are probably correct, but asking for more quantiles at once still has a cost, i.e. `system.time(quantile(v,c(0.2,0.8)))` takes longer than `system.time(quantile(v,c(0.2)))`. To compare one combined call against two separate calls: `dat <- rpois(1e7, 10); system.time(quantile(dat, c(0.2,0.8)))[3] - system.time(quantile(dat, c(0.2)))[3] - system.time(quantile(dat, c(0.2)))[3]` – 6th Sep 14 '14 at 02:56
1

You can try to "listify" the results. Something like:

DT[, as.list(quantile(v, c(0.2, 0.8))), by = id] 

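If you want to rename the new columns, store the result first and rename that (DT itself only has the columns id and v):

res <- DT[, as.list(quantile(v, c(0.2, 0.8))), by = id]
setnames(res, c("id", paste0("q", c(0.2, 0.8))))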
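
Since quantile labels its output by percentage, the listified columns come out named 20% and 80%. A minimal sketch of renaming them by their old names rather than by position, assuming the data.table package is installed; the object name res, the target names p20 and p80 (taken from the question's desired output), and the fixed seed are choices made for illustration:

```r
library(data.table)

set.seed(1)  # assumption: fixed seed so the example is repeatable
DT <- data.table(id = sample(10, 100, replace = TRUE), v = runif(100))

# One quantile() call per group; as.list() turns each quantile into a column.
res <- DT[, as.list(quantile(v, c(0.2, 0.8))), by = id]
default_names <- names(res)  # "id", "20%", "80%"

# Rename by old name, by reference, so column order does not matter.
setnames(res, old = c("20%", "80%"), new = c("p20", "p80"))
print(res)
```

Renaming by old name fails loudly if the expected columns are missing, which is a little safer than overwriting all names positionally.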
Matthew