How can I implement stratified sampling in a randomForest regression in R? I know that the strata and sampsize parameters work for randomForest classification problems, but when I pass a sampsize vector in a regression I get: Error in { : task 1 failed - "sampsize should be of length one."
My data:
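set.seed(42)  # arbitrary seed so the example is reproducible
x <- sample(1:10, 100, replace = TRUE)   # numeric response
y <- sample(1:20, 100, replace = TRUE)   # numeric predictor
Region <- sample(c('N', 'S'), 100, replace = TRUE)   # stratification variable
df <- data.frame(x, y, Region)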
My code:
library(randomForest)
# This call produces the error above
randomForest(x ~ y, data = df, sampsize = c(30, 20), strata = df$Region)
In my actual analysis the groups are far more imbalanced than in this example. Thank you.