Set-up to problem: I want to split my 500-row data frame into 10 subsets of 50 rows each. Then I want to use 9 of them as the training set and 1 as the test set, in such a way that each subset gets one "chance" to be the test data.
What I have done so far:
# Get my 10 equal subsets and convert each to a data matrix.
b <- seq(50, 500, 50)
subsets <- lapply(seq_along(b), function(i) trainN[(b-49)[i]:b[i], ])
subsetsDF <- lapply(subsets, data.matrix)
# subsetsDF is a list of 10 data matrices
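One possible alternative for the splitting step, sketched under assumptions: split() with a vector of fold labels produces the same 10 blocks without computing start and end indices by hand. The toy trainN and the name fold_id below are made up for illustration and are not part of my real data.

```r
# Toy stand-in for trainN (hypothetical data: 500 rows, 3 numeric columns)
set.seed(1)
trainN <- data.frame(x1 = rnorm(500), x2 = rnorm(500), y = rnorm(500))

# Fold label for every row: rows 1-50 get 1, rows 51-100 get 2, and so on
fold_id <- rep(1:10, each = 50)

# split() returns a list of 10 data frames, one per fold label;
# data.matrix() then converts each one, as in the code above
subsetsDF <- lapply(split(trainN, fold_id), data.matrix)

length(subsetsDF)    # 10
dim(subsetsDF[[1]])  # 50 3
```

If the rows should be assigned to folds at random rather than in order, fold_id could be shuffled with sample(fold_id) before splitting.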
What I cannot do: I do not know how to loop over the list of data matrices from index 1 to 10, using the matrix at the current index as the test data and rbinding the rest into the training data. My attempt so far:
function(data) {
  n_subset <- 1:10
  for (j in seq_along(n_subset)) {
    test <- data[[j]]
    # train <- do.call(rbind, data[[-j]])  # of course this isn't right
  }
}
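A minimal sketch of how the loop might be completed, assuming the input is a list of matrices with the same columns. The key point is that data[-j] (single brackets) returns a list of every element except the j-th, which do.call(rbind, ...) can stack, whereas data[[-j]] tries to extract a single element and fails. The name make_splits and the toy folds list are hypothetical.

```r
# Toy stand-in for subsetsDF (hypothetical: 10 matrices of 50 rows x 3 columns)
set.seed(1)
folds <- lapply(1:10, function(i) matrix(rnorm(150), nrow = 50))

# make_splits is a made-up helper name, not from any package
make_splits <- function(data) {
  splits <- vector("list", length(data))  # preallocate one slot per fold
  for (j in seq_along(data)) {
    test  <- data[[j]]                    # fold j is held out as test data
    train <- do.call(rbind, data[-j])     # the other folds, stacked by row
    splits[[j]] <- list(train = train, test = test)
  }
  splits                                  # return all train/test pairs
}

splits <- make_splits(folds)
dim(splits[[1]]$train)  # 450 3
dim(splits[[1]]$test)   # 50 3
```

Storing each pair in a list means the function returns something usable, rather than overwriting test and train on every pass.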
Note: I welcome any suggestions that completely restructure my approach. As I reflect on what I'm trying to do, it occurs to me that this may not be the best way. Nonetheless, the question as posed still interests me, so I welcome both answers that strictly address it and answers that offer a better approach. (For example, I have read that for loops are not efficient in R, but I don't know how to do this with the apply functions. After I complete this, I need to nest it within another function that runs the 10 different models resulting from the different test/train divisions on each of several values for k.)
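To make the nesting concrete, here is a rough sketch of the shape this might take with the apply family: an inner vapply over the held-out fold and an outer sapply over the values of k. Everything named here is an assumption for illustration. In particular, fit_and_score is a hypothetical placeholder that returns a dummy number so the sketch runs; a real version would fit a model with parameter k on train and score it on test. As far as I understand, the apply functions mainly make this tidier rather than dramatically faster than a preallocated for loop.

```r
# Toy stand-in for subsetsDF (hypothetical: 10 matrices of 50 rows x 3 columns)
set.seed(1)
folds <- lapply(1:10, function(i) matrix(rnorm(150), nrow = 50))

# Hypothetical placeholder for "fit a model with parameter k, then score it".
# It returns a dummy value only so the structure can be run end to end.
fit_and_score <- function(train, test, k) {
  nrow(train) / nrow(test) + k
}

# One score per fold for a single value of k
cv_for_k <- function(data, k) {
  vapply(seq_along(data), function(j) {
    fit_and_score(train = do.call(rbind, data[-j]),  # all folds except j
                  test  = data[[j]],                 # fold j held out
                  k     = k)
  }, numeric(1))
}

k_values <- c(1, 3, 5)  # made-up candidate values for k

# Rows are folds, columns are values of k
scores <- sapply(k_values, function(k) cv_for_k(folds, k))
dim(scores)             # 10 3
colMeans(scores)        # average score per value of k
```

With a real scoring function, the value of k whose column mean is best would be the one to pick.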