I have a dataframe from which I want to draw a random sample. It should not be just any sample, but one that contains exactly one randomly sampled row for each unique value in the column word:

set.seed(123)
df <- data.frame(
  word = sample(LETTERS[1:5], 50, replace = TRUE),
  value = sample(1:10, 50, replace = TRUE)
)
head(df)
  word value
1    B     1
2    D     5
3    C     8
4    E     2
5    E     6
6    A     3

Here is what I've tried so far:

1. Store the unique words in a vector:

UniqueWords <- unique(df$word)

2. Set up a for loop:

for(i in UniqueWords){
  df_sample[i,] <- df[sample(1:nrow(df[df$word==UniqueWords[i], ]), 1), ]
}

The loop, however, does not produce the correct result. How can it be tweaked or, alternatively, what other method can be used?
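
For reference, three things seem to go wrong in the loop as written: i takes the words themselves as values, so UniqueWords[i] looks up a name that does not exist and returns NA; df_sample is never created before rows are assigned to it; and the sampled number is a position within the subset for one word, yet it is used to index the full df. Below is a minimal sketch of one possible fix, followed by a loop-free base R variant. The names rows, idx, picked and df_sample2 are only illustrative and not part of the original attempt.

```r
set.seed(123)
df <- data.frame(
  word = sample(LETTERS[1:5], 50, replace = TRUE),
  value = sample(1:10, 50, replace = TRUE)
)

UniqueWords <- unique(df$word)

# Corrected loop: collect one sampled row per word in a list
rows <- vector("list", length(UniqueWords))
for (i in seq_along(UniqueWords)) {
  # Row numbers in df that belong to the current word
  idx <- which(df$word == UniqueWords[i])
  # Pick one position within idx; sampling a position avoids sample()'s
  # special behaviour when idx happens to have length one
  rows[[i]] <- df[idx[sample.int(length(idx), 1)], ]
}
df_sample <- do.call(rbind, rows)

# Loop-free variant: split the row numbers by word, pick one per group
picked <- vapply(
  split(seq_len(nrow(df)), df$word),
  function(idx) idx[sample.int(length(idx), 1)],
  integer(1)
)
df_sample2 <- df[picked, ]
```

Both df_sample and df_sample2 should then hold exactly one row per unique word. Which rows are drawn depends on the seed and on the order of the sample calls, so the two results will generally differ from each other.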

Chris Ruehlemann