
I can't figure out how to do the following: creating a dynamic number of columns from a column of lists with data.table.

library(data.table)
set.seed(123); N = 1e5
DT = data.table(x=rnorm(N), y=sample(c('a','b','c'), N, TRUE))
probs = seq(.1,1,.1); newCols <- paste("q",100*probs,sep="")

DT2 <- DT[ ,list(Q=list(quantile(x,probs=probs))),by=y]
DT2
#   y                                                                          Q
#1: b -1.2817037351734,-0.840293441466144,-0.525195748246148,-0.259574774974136,
#2: c -1.26975023312311,-0.832359658553173,-0.513320691339448,-0.247863323660894,
#3: a -1.28189935066568,-0.838918942382995,-0.522409189372727,-0.257356179072232,

#Here I want to create 10 columns from Q called q10, q20...
DT2[ , newCols:=Q] #can't make this work because it is evaluated in the wrong environment I guess
statquant

1 Answer


Try this:

DT2 <- DT[ , as.list(quantile(x,probs=probs)),by=y]
setnames(DT2, c("y", paste0("q", seq(10, 100, by=10))))

#    y       q10        q20        q30        q40          q50       q60       q70       q80
# 1: b -1.281704 -0.8402934 -0.5251957 -0.2595748 -0.001625739 0.2526686 0.5251940 0.8379979
# 2: c -1.269750 -0.8323597 -0.5133207 -0.2478633  0.003413041 0.2598378 0.5353759 0.8477539
# 3: a -1.281899 -0.8389189 -0.5224092 -0.2573562  0.001186281 0.2542550 0.5244238 0.8401411
#         q90     q100
# 1: 1.284773 3.856234
# 2: 1.283465 4.322815
# 3: 1.273615 3.921410
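
If the list column `Q` from the question already exists and recomputing is not an option, it can probably be expanded in place as well. The following is a minimal sketch rather than part of the original answer: it assumes a data.table version that provides `transpose()`, uses a smaller `N` for speed, and builds `newCols` from `seq(10, 100, by = 10)` as in the `setnames()` call above.

```r
library(data.table)

set.seed(123); N <- 1e4
DT <- data.table(x = rnorm(N), y = sample(c("a", "b", "c"), N, TRUE))
probs <- seq(.1, 1, .1)
newCols <- paste0("q", seq(10, 100, by = 10))

# List column as in the question: one vector of 10 quantiles per group
DT2 <- DT[, list(Q = list(quantile(x, probs = probs))), by = y]

# transpose() turns the list of per-group vectors into a list of
# per-quantile vectors, one element per new column.
# Wrapping newCols in parentheses makes `:=` use its contents as column
# names instead of creating a single column literally called "newCols".
DT2[, (newCols) := transpose(Q)]
DT2[, Q := NULL]  # drop the list column once it has been expanded
```

The parentheses around `newCols` are the piece missing from `DT2[ , newCols:=Q]` in the question; without them the left-hand side is taken as a literal column name.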
Arun
  • Nicely done. I can't find another post mentioning this trick, though I have the impression it is vanilla... What is strange is that if I create a named vector of NA whose names are newCols, then `DT2[, names(MyNAVec):=Q]` would work – statquant Apr 22 '13 at 14:59
  • @statquant, there are quite a few recent posts actually (probably not easy to find as they don't have obvious titles). See [**this**](http://stackoverflow.com/a/15510828/559784) and [**this**](http://stackoverflow.com/questions/6902087/proper-fastest-way-to-reshape-a-data-table/15512437#15512437). – Arun Apr 22 '13 at 15:02
  • @statquant, please check the edited solution. I've modified it to be *faster*. The previous solution is inefficient with a lot of groups (because the names are created for each group). – Arun Apr 22 '13 at 15:03
  • Ok, I see, I was missing the `as.list`, though I am surprised that even with `with=F` we cannot do it in one line – statquant Apr 22 '13 at 15:11
  • You can wrap the `as.list` with `setattr(as.list(...), 'names', )`. But it's inefficient. – Arun Apr 22 '13 at 15:16
  • @Arun this is a very old post so I was curious whether a better solution now exists. Is there a one-liner alternative that is superior to or more efficient than the `setattr()` solution? – qdread Jan 04 '22 at 13:57
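
For reference, the `setattr()` one-liner mentioned in the comments can be sketched as below, next to the two-step version from the answer. This is an illustration under assumptions (a smaller `N`, and `newCols` passed as the names argument that the comment leaves out); it does not settle whether a faster idiom exists in current data.table releases.

```r
library(data.table)

set.seed(123); N <- 1e4
DT <- data.table(x = rnorm(N), y = sample(c("a", "b", "c"), N, TRUE))
probs <- seq(.1, 1, .1)
newCols <- paste0("q", seq(10, 100, by = 10))

# One-liner: the names are set inside j, so they are rebuilt for every
# group, which is the inefficiency pointed out in the comments.
DT3 <- DT[, setattr(as.list(quantile(x, probs = probs)), "names", newCols), by = y]

# Two-step version from the answer: compute first, rename once afterwards.
DT4 <- DT[, as.list(quantile(x, probs = probs)), by = y]
setnames(DT4, c("y", newCols))
```

With only three groups the difference is negligible; the per-group naming cost should only matter when `by` produces many groups.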