
I am working with the Auto data set in the ISLR library. How do I split the data into 75% train and 25% test? I think I split it correctly, but I can't figure out how to exclude the two columns.

Create a train and a test set

a. 75% train, 25% test

b. set seed to 1234 to get reproducible results

c. do not include columns “name” and “mpg” in the train and test sets

   #3b
   set.seed(1234)
   #3a
   splitTheData <- sample(nrow(Auto), floor(nrow(Auto) * 0.75), replace = FALSE)

   #3c
theguy0994
  • I am not supposed to include those columns – theguy0994 Jun 28 '17 at 17:07
  • a. `set.seed` before you `sample` for reproducibility. b. If you don't want those columns, subset them out first (though with the formula interface it usually doesn't matter if there are extra columns there). – alistaire Jun 28 '17 at 17:11
  • Is your question about splitting the data into training and test or about dropping columns? – shea Jun 28 '17 at 17:18
  • Both. I am supposed to split the data AND I must not include the name and mpg columns – theguy0994 Jun 28 '17 at 17:25
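
The advice in the comments (seed first, subset the columns out, then sample) could be put together roughly as follows. This is only a sketch, assuming the ISLR package is installed and that `Auto` has columns named `name` and `mpg`; the object names `auto_subset`, `train_idx`, `train`, and `test` are illustrative and not from the original post.

```r
library(ISLR)  # assumed to be installed; provides the Auto data set

# 3c: drop the two unwanted columns before splitting
auto_subset <- Auto[, !(names(Auto) %in% c("name", "mpg"))]

# 3b: set the seed before sampling so the split is reproducible
set.seed(1234)

# 3a: draw 75% of the row indices for training; the remaining rows form the test set
train_idx <- sample(nrow(auto_subset), floor(0.75 * nrow(auto_subset)))
train <- auto_subset[train_idx, ]
test  <- auto_subset[-train_idx, ]
```

Dropping the columns before indexing means both sets inherit the same reduced column set, and using negative indices for `test` keeps the two sets disjoint.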

0 Answers