I'm struggling to use data.table to summarize results of vector functions, something that's easy in ddply.
Issue 1: aggregate with an (expensive) function with vector output
library(data.table)
library(plyr)
dt <- data.table(x=1:20,y=rep(c("a","b"),each=10))
This ddply command produces what I want:
ddply(dt,~y,function(dtbit) quantile(dtbit$x))
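This data.table command does not do what I want: it stacks the quantiles in a single column (V1), one row per quantile, instead of giving one column per quantile: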
dt[,quantile(x),by=list(y)]
I can hack at data.table like so:
dt[,list("0%"=quantile(x,0),"25%"=quantile(x,0.25),
"50%"=quantile(x,0.5)),by=list(y)]
But that is verbose, and it would also be slow if the vector function (here quantile) were slow, because it is called once per output column in every group.
A similar example is:
dt$z <- rep(sqrt(1:10),2)
ddply(dt,~y,function(dtbit) coef(lm(z~x,dtbit)))
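One possible way to get the ddply-style wide output in data.table is to wrap the vector result in as.list(), so that each named element becomes its own column and the function runs only once per group. This is a minimal sketch, not a confirmed best practice. It assumes the function returns a named vector of the same length for every group, and wide_q and wide_coef are names made up for the illustration.

```r
library(data.table)

dt <- data.table(x = 1:20, y = rep(c("a", "b"), each = 10))
dt[, z := rep(sqrt(1:10), 2)]

# quantile() is called once per group; as.list() spreads its named
# elements across columns ("0%", "25%", "50%", "75%", "100%").
wide_q <- dt[, as.list(quantile(x)), by = y]

# Same idea for the regression coefficients ("(Intercept)" and "x").
wide_coef <- dt[, as.list(coef(lm(z ~ x))), by = y]
```

Both results should have one row per value of y, like the ddply output.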
Issue 2: Using a function with both vector input and output
xzsummary <- function(dtbit) t(summary(dtbit[,"x"]-dtbit[,"z"]))
ddply(dt,~y,xzsummary)
Can I do that kind of thing easily in data.table?
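For the two-column case, a sketch along the same lines might refer to both columns directly in j, since column names are visible there as variables. The assumptions here: unclass() is used only to drop the summary class before as.list(), and wide_s is a hypothetical name.

```r
library(data.table)

dt <- data.table(x = 1:20, y = rep(c("a", "b"), each = 10))
dt[, z := rep(sqrt(1:10), 2)]

# x and z are both available inside j, so the difference is computed
# per group and its summary is spread across columns by as.list().
wide_s <- dt[, as.list(unclass(summary(x - z))), by = y]
```

If the function really needs the whole per-group table rather than individual columns, data.table exposes it inside j as .SD, so a function written for a data.frame piece could be called as f(.SD), provided it returns a list.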
Apologies if these questions are already prominently answered.
This is similar, but not identical, to: data.table aggregations that return vectors, such as scale()