Random Sample of rows from an R dataset

Question

Suppose I have a dataset of dimension 90,000 x 17 (n x p), where n is the number of observations and p is the number of variables. I would like to take a random sample of 20% of the rows from the whole dataset. How can this be done in R?

After taking the random sample, I will perform a cluster analysis on it.

I tried to work from the answers to other questions, but they were inconclusive and did not give me what I needed.

Remember to fix seed `set.seed(1492)` (or any number) in order to obtain reproducibility of your sample! — LocoGris, Mar 05 '19 at 14:35

score 6 · Accepted Answer · edited Mar 05 '19 at 14:38

You can do it with `sample_frac()` from dplyr. Here is an example using the built-in iris dataset:

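 library(dplyr)
 # iris ships with R, so calling data(iris) first is optional
 set.seed(1492)  # any fixed number makes the sample reproducible
 # Keep a random 20% of the rows (sampled without replacement)
 sample20 <- iris %>% sample_frac(0.2)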
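
If you would rather not depend on dplyr, the same idea can be sketched in base R by sampling row indices. This is only an illustration: `df`, `n_keep`, `idx` and `sample20_base` are names I made up, and iris stands in for the 90,000 x 17 dataset from the question.

```r
# Base R sketch: draw 20% of the row indices, then subset the data frame.
set.seed(1492)                           # fixed seed so the draw is reproducible

df <- iris                               # stand-in for your own data frame
n_keep <- round(0.2 * nrow(df))          # number of rows that make up 20%
idx <- sample(nrow(df), size = n_keep)   # row indices, drawn without replacement
sample20_base <- df[idx, ]               # keep only the sampled rows
```

With a 90,000-row data frame this would keep 18,000 rows. In dplyr 1.0.0 and later, `slice_sample(prop = 0.2)` supersedes `sample_frac(0.2)`, although `sample_frac()` still works.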

edited Mar 05 '19 at 14:38

NelsonGon

answered Mar 05 '19 at 14:35

Derek Corcoran
