Suppose I have a dataset of dimension 90,000 x 17, i.e. n x p, where n is the number of observations and p is the number of variables. I would like to take a random sample of 20% of the rows from the whole dataset. How can this be done in R?
After taking the random sample, I will perform cluster analysis on it.
I tried the answers to other, similar questions, but they were inconclusive and did not give me what I needed.
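
To make the goal concrete, here is a minimal sketch of the kind of result I am after. It assumes the data sits in a data frame; the name `df` and the simulated values are hypothetical stand-ins for my real data, and base R's `sample()` is just one possible way to draw the rows (without replacement).

```r
# Hypothetical stand-in for the real data: 90,000 rows x 17 numeric columns.
set.seed(42)  # fixed seed so the same rows are drawn on every run
n <- 90000
p <- 17
df <- as.data.frame(matrix(rnorm(n * p), nrow = n, ncol = p))

# Number of rows that make up 20% of the dataset.
sample_size <- floor(0.20 * nrow(df))

# Draw that many row indices uniformly at random, without replacement.
idx <- sample(seq_len(nrow(df)), size = sample_size)

# Keep only the sampled rows; all 17 columns are retained.
df_sample <- df[idx, ]

dim(df_sample)
```

With 90,000 rows this should leave 18,000 rows and all 17 columns in `df_sample`, which is what I would then pass on to the cluster analysis.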