
I have a dataset with 400K observations and 250 features, and I would like to perform stratified sampling on it.

I have looked at many links, but all of them show examples that stratify on only one or two variables, including the target.

Can anybody please help me with how to perform stratified sampling using R or Python?

Thanks in advance!

Adarsha Murthy

1 Answer


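If you first group your data.frame by the stratification variable, you can sample a fixed number of rows from each group using dplyr's sample_n(). group_by() also accepts several columns, in which case each combination of their values is treated as one stratum.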

library(dplyr)

# Draw 10 rows at random from each group defined by ID (the stratum column)
sample.df <- df %>% group_by(ID) %>% sample_n(10)
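
Since the question also asks about Python and about stratifying on more than one variable, the same idea can be sketched with pandas. This is only a minimal illustration, assuming pandas 1.1 or later (which provides sampling on grouped frames); the helper name stratified_sample, the column names target and region, and the toy data are hypothetical and not from the original post.

```python
import numpy as np
import pandas as pd


def stratified_sample(df, strata_cols, frac, seed=0):
    """Draw roughly the same fraction of rows from every combination
    of values in strata_cols (hypothetical helper, not a library API)."""
    return (
        df.groupby(strata_cols, group_keys=False)
          .sample(frac=frac, random_state=seed)
    )


# Toy data standing in for the real 400K x 250 dataset (assumption).
rng = np.random.default_rng(0)
n = 10_000
df = pd.DataFrame({
    "target": rng.integers(0, 2, size=n),
    "region": rng.choice(["north", "south", "east"], size=n),
    "x1": rng.normal(size=n),
})

# Stratify jointly on two columns; more columns can be added to the list.
sample_df = stratified_sample(df, ["target", "region"], frac=0.1)
```

With many stratification columns the number of distinct combinations grows quickly, so some strata may contain very few rows; it may help to check the group sizes before choosing the columns and the fraction.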
Wimpel