R - choose a sample from a data set from each group of values

Question

I have a data set that looks like this:

string_1,score,group
"sdfsd",0.546,0.5
"sdfsd",0.53,0.5
"sdfsd",0.52,0.5
"dgfbx",0.43,0.4
"dsgfgsd",0.48,0.4
"dsgfgsd",0.42,0.4
"dsgfgsd",0.84,0.8
"dsgfgsd",0.83,0.8
"dsgfgsd",0.82,0.8

I want to take a sample from each group: that is, 2 random rows for each value of the group field (0.4, 0.5, 0.8).

What is the simplest way to do it?

Thanks

With `library(dplyr)` that would be `dat %>% group_by(group) %>% sample_n(2)` — talat, Nov 09 '15 at 12:22

score 2 · Accepted Answer · answered Nov 09 '15 at 12:03

You could consider doing something like this in base R. It splits your data by group, samples 2 rows from each piece, and binds the sampled rows back together.

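set.seed(1)  # make the random sample reproducible
# split by group, draw 2 random rows from each piece, then rbind the pieces
res <- do.call(rbind, lapply(split(dat, dat$group), function(x) {
  x[sample(nrow(x), 2), ]
}))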
> res
      string_1 score group
0.4.4    dgfbx  0.43   0.4
0.4.6  dsgfgsd  0.42   0.4
0.5.2    sdfsd  0.53   0.5
0.5.3    sdfsd  0.52   0.5
0.8.7  dsgfgsd  0.84   0.8
0.8.8  dsgfgsd  0.83   0.8
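
For anyone who wants to run this end to end, here is a self-contained sketch of the same split/sample/rbind idea. It rebuilds the question's data as `dat` and wraps the logic in a helper; the name `sample_per_group` and its arguments are made up for illustration and are not part of the original answer. It also assumes you would rather get fewer rows than an error when a group has fewer than `n` rows, which the one-liner above does not handle.

```r
# Rebuild the question's data as a data frame (named `dat` as in the answer).
dat <- data.frame(
  string_1 = c("sdfsd", "sdfsd", "sdfsd", "dgfbx", "dsgfgsd",
               "dsgfgsd", "dsgfgsd", "dsgfgsd", "dsgfgsd"),
  score = c(0.546, 0.53, 0.52, 0.43, 0.48, 0.42, 0.84, 0.83, 0.82),
  group = c(0.5, 0.5, 0.5, 0.4, 0.4, 0.4, 0.8, 0.8, 0.8),
  stringsAsFactors = FALSE
)

# Hypothetical helper: sample up to `n` rows from each value of column `by`.
sample_per_group <- function(df, by, n) {
  pieces <- lapply(split(df, df[[by]]), function(x) {
    k <- min(n, nrow(x))  # assumption: small groups return all their rows
    x[sample.int(nrow(x), k), , drop = FALSE]
  })
  out <- do.call(rbind, pieces)
  rownames(out) <- NULL  # drop the "0.4.4"-style row names
  out
}

set.seed(1)
res <- sample_per_group(dat, "group", 2)
```

Which rows come back depends on the seed and on your R version, so the output may not match the one shown above row for row, but every group should contribute exactly 2 rows here.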
