I'm new to R. I have a large file with multiple columns, and I've been asked to split the data into two parts. I have R split the data randomly, with 70% going into a group called nTrain and 30% into a group called nTest.
I was able to split the data randomly, but I now need to calculate the average of a specific column in the 70% random sample and do the same for the 30% random sample. Can someone please explain how to do so?
Thanks.
If it helps understand my situation, this is what I have so far in R:
length(DataFile)
(nData=nrow(DataFile))
DataFile
set.seed(0)
(trainIdx <- sample(seq(1, nrow(DataFile)), floor(nrow(DataFile) * 0.70)))
> (nTrain=length(trainIdx))
[1] 15129
> (nTest=nData-nTrain)
[1] 6484
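For reference, here is a rough sketch of what I think the calculation might look like. It assumes that nTrain and nTest are only the row counts, so the rows themselves still have to be pulled out of DataFile with trainIdx. The small data frame and the column name Price are made-up stand-ins, since my real column is not shown here; trainData, testData, trainMean and testMean are also names I introduced for illustration.

```r
# Toy stand-in for DataFile; the column name "Price" is hypothetical.
DataFile <- data.frame(
  Id    = 1:10,
  Price = c(10, 20, 30, 40, 50, 60, 70, 80, 90, 100)
)

set.seed(0)
trainIdx <- sample(seq_len(nrow(DataFile)), floor(nrow(DataFile) * 0.70))

# Rows selected by trainIdx form the 70% part;
# negative indexing keeps every row that was not selected (the 30% part).
trainData <- DataFile[trainIdx, ]
testData  <- DataFile[-trainIdx, ]

# Average of one column in each part; na.rm = TRUE skips missing values.
trainMean <- mean(trainData$Price, na.rm = TRUE)
testMean  <- mean(testData$Price, na.rm = TRUE)

trainMean
testMean
```

With the real data, Price would be replaced by the actual column name, and the toy data.frame() call would be dropped in favor of the DataFile already loaded.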