Given a data frame in the form:

group          val
A              10
A              1
B              9
C              6
...

I'd like to randomly sample the val values within each group and collect the result in a new data frame. The problem is that the number of values differs between groups, so I can't call sample() with one fixed size. I'd like to choose the sample size with an if-else condition: if a group has more than, let's say, 3 values, then three of them are sampled. Otherwise all of the group's values are taken. How can I do that? Thank you in advance!

user5779223

1 Answer

We can use data.table. Convert the 'data.frame' to a 'data.table' with setDT(df), then, grouped by 'group', take a sample of 'val'.

library(data.table)
# Sample row positions: sample(val) on a one-row group would draw from 1:val
setDT(df)[, .(val = val[sample.int(.N)]), by = group]
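
A caveat when shuffling values within groups, sketched here in base R: when sample() is given a single number n >= 1, it samples from 1:n instead of returning that number, so a one-row group such as B (val 9) would come back with nine values. Sampling row positions and indexing avoids this. The helper name safe_shuffle is hypothetical and only used for illustration.

```r
# sample() treats a single number n >= 1 as the range 1:n
x <- 9
length(sample(x))        # 9 values drawn from 1:9, not just the value 9

# Hypothetical helper: shuffle by position so length-1 vectors stay intact
safe_shuffle <- function(v) v[sample.int(length(v))]

safe_shuffle(x)          # 9
safe_shuffle(c(10, 1))   # 10 and 1 in random order
```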

If we need a condition so that groups with more than 3 rows contribute 3 sampled values and smaller groups contribute all of their values:

# Index by sampled row positions so one-row groups return their own value
setDT(df)[, .(val = if (.N > 3) val[sample.int(.N, 3)] else val[sample.int(.N)]), by = group]
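
To check the capped sampling end to end, here is a minimal sketch on toy data. The extra rows in group A and the name max_n are assumptions added for illustration. It samples row positions with sample.int() and caps the size with min(), which is a compact alternative to an explicit if/else rather than a different method.

```r
library(data.table)

# Toy data modelled on the question; the extra rows in group A are made up
df <- data.frame(
  group = c("A", "A", "A", "A", "A", "B", "C"),
  val   = c(10, 1, 4, 7, 3, 9, 6)
)

max_n <- 3L   # cap on the per-group sample size (hypothetical name)

set.seed(42)  # only to make the draw reproducible

# For each group, draw min(.N, max_n) row positions without replacement
res <- setDT(df)[, .(val = val[sample.int(.N, min(.N, max_n))]), by = group]

res[, .N, by = group]  # A has 3 rows, B and C have 1 each
```

Groups B and C have a single row each, so they keep their own values (9 and 6), while A is reduced to three of its five values.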
akrun