I am training a bagFDA model with the train() function in the R caret package and saving the model output as an .Rdata file. The input is about 300k records with 26 variables, but the output .Rdata file is 3 GB. I simply run the following:

library(caret)
modelout <- train(x, y, method = "bagFDA")
save(modelout, file = "myout.Rdata")

on a Windows system. Questions: (1) Why is myout.Rdata so big? (2) How can I reduce the size of the file?

Thanks in advance!

JT

kangaroo_cliff
StatIsFun

1 Answer

For starters, set returnData = FALSE in trainControl so you're not storing an extra copy of the data in the model. My understanding is that bagFDA fits one model per bootstrap sample, which essentially keeps that many copies of your data. Lowering the B parameter (the number of bootstrap samples, 50 by default) should shrink the object as well. Also, check out this post:

Why is caret train taking up so much memory?
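
A minimal sketch of the two settings mentioned above, assuming the caret, earth, and mda packages are installed. The iris data, the 3-fold CV, the one-row tuning grid, B = 10, and the names fit_small and myout_small.Rdata are placeholders chosen only to keep the example quick; they are not the asker's setup. B is not an argument of train() itself; it is passed along through ... to bagFDA().

```r
library(caret)  # method = "bagFDA" also needs the earth and mda packages

set.seed(1)
x <- iris[, 1:4]    # stand-in predictors (assumption: not the asker's data)
y <- iris$Species   # factor outcome, since FDA is a classifier

# returnData = FALSE: do not keep a copy of the training data in the train object.
ctrl <- trainControl(method = "cv", number = 3, returnData = FALSE)

# One-row tuning grid (hypothetical values) so the sketch runs quickly.
grid <- data.frame(degree = 1, nprune = 5)

# B is forwarded to bagFDA(); 10 bootstrap fits instead of the default 50.
fit_small <- train(x, y, method = "bagFDA", B = 10,
                   trControl = ctrl, tuneGrid = grid)

# Check the in-memory size before writing anything to disk.
print(object.size(fit_small), units = "MB")
print(object.size(fit_small$finalModel), units = "MB")

save(fit_small, file = "myout_small.Rdata")
```

Comparing object.size() for a few values of B on a subsample of the real data should show whether the bagged fits are what dominates the file size. Fewer bootstrap fits can also change predictive performance, so it is worth checking accuracy alongside size.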

kangaroo_cliff
Matthew