
I have a sliderInput in a Shiny app that defines how much of the data goes into the train and test sets of a classification problem. Right now I'm doing this to sample from the data, where input$percentInTrain is a value between 0 and 1:

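        # Number of training rows: the chosen fraction of all rows, rounded down
        n_train <- floor(nrow(dt) * input$percentInTrain)
        # Despite its name, testidx holds the row indices sampled for training
        testidx <- sample(seq_len(nrow(dt)), size = n_train)
        rvtrain <- dt[testidx, ]
        rvtest <- dt[-testidx, ]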

The question is: is there a better, less ugly way to do this?

vladli
  • see https://stackoverflow.com/questions/17200114/how-to-split-data-into-training-testing-sets-using-sample-function-in-r-program – timfaber Jun 30 '17 at 09:34
  • @timfaber thank you, I didn't find it in my search attempts. – vladli Jun 30 '17 at 09:38
  • I think your solution is actually pretty decent. I normally use createDataPartition from the `caret` package, which is pretty nice, but it splits the data based on a given class variable, and I don't think that is something you need here. – timfaber Jun 30 '17 at 09:41
  • I'm slowly moving toward learning the `caret` package, though I'm doing everything without it for now. After reading the suggested thread I came up with this piece of code, by the way: `smp_size <- floor(as.numeric(paste0("0.", input$percentInTrain)) * nrow(dt)); testidx <- sample(seq_len(nrow(dt)), size = smp_size)` – vladli Jun 30 '17 at 11:10
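
For reference, the approach discussed in the comments can be wrapped in a small base R helper. This is only a sketch: `split_train_test` and its `frac` argument are hypothetical names that do not appear in the thread, and the built-in `iris` data set stands in for `dt`. In the Shiny app, `frac` would be `input$percentInTrain`, assuming the slider returns a value between 0 and 1.

```r
# Hypothetical helper (not from the thread): split a data frame into
# train/test parts, where `frac` is the fraction of rows used for training.
split_train_test <- function(dt, frac) {
  stopifnot(is.data.frame(dt), frac >= 0, frac <= 1)
  n_train <- floor(frac * nrow(dt))  # number of training rows, rounded down
  train_idx <- sample(seq_len(nrow(dt)), size = n_train)
  # Use a logical mask rather than dt[-train_idx, ]: with zero training rows,
  # negative indexing on an empty vector would return no rows at all.
  in_train <- seq_len(nrow(dt)) %in% train_idx
  list(train = dt[in_train, , drop = FALSE],
       test  = dt[!in_train, , drop = FALSE])
}

set.seed(42)  # only to make the example reproducible
parts <- split_train_test(iris, 0.8)  # iris stands in for dt
nrow(parts$train)
nrow(parts$test)
```

Both parts keep the original row order here, since the sampled indices are only used to build the mask. Index with `dt[train_idx, ]` instead if a shuffled training set is wanted.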
