I have a data.table from which I want to select a random subset, but only for some operations.
Suppose the data is
dat <- data.table(id=1:100, group=sample(1:20,100, replace=TRUE), a=runif(100), b=rnorm(100))
and I want to do two things:
- count the number of ids per group
- select one id at random from each group and record its values of a and b
I could follow "How do you extract a few random rows from a data.table on the fly" and use
dat[, list(n=.N, a=a[sample(.N,1)], b=b[sample(.N,1)]), by=group]
but I am afraid this will select a and b independently of one another, since each call to sample draws its own index. Is there a way to take both values from the same randomly chosen row?
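For illustration, here is a minimal sketch of one approach that might work: draw the row index once per group inside a braced j expression and reuse it for every column. The names i and res are only illustrative, and set.seed is there purely to make the sketch reproducible.

```r
library(data.table)

set.seed(42)  # assumption: fixed seed, only so the sketch is reproducible
dat <- data.table(id = 1:100,
                  group = sample(1:20, 100, replace = TRUE),
                  a = runif(100),
                  b = rnorm(100))

# Draw a single row index per group and reuse it for every column,
# so id, a and b all come from the same row of that group.
res <- dat[, {
  i <- sample(.N, 1)  # one draw per group
  list(n = .N, id = id[i], a = a[i], b = b[i])
}, by = group]
```

Because i is computed once per group, a and b cannot come from different rows, and keeping id in the result makes it easy to check which row was picked.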