
I am new to machine learning and R. I want to fit a statistical model to predict the daily hours of electricity supply (y), and I have several x variables to use as predictors. I have three goals:

  1. I want to use some form of regularization to choose which x variables go into the model.
  2. y is bounded between 0 and 24, so I want the predictions to stay within this range as well.
  3. The data have spatial attributes, and I want to use spatial cross-validation as the resampling strategy when tuning the regularization parameters.

I am planning to use the mlr package in R. Which learner can achieve all three of these goals?

Many thanks.

Mihir Sharma
  • Please give us some more detail, in particular on what you've tried. Many learners in mlr can achieve what you want, and you can easily try that. – Lars Kotthoff Feb 23 '19 at 00:31
  • Thanks Lars. I initially tried `regr.glmnet`, but I don't think I can force its predictions into a desired range [(0,24) in my case]. I then explored rescaling my dependent variable to a fraction between 0 and 1 and using fractional response regression (https://stackoverflow.com/questions/37584715/fractional-response-regression-in-r), but I am not sure this is doable in mlr: I tried `regr.glm`, but I can't change the family to quasibinomial. I know that tree-based algorithms necessarily yield predictions within the observed range of y, but I have a very small dataset and have been advised not to use them. – Mihir Sharma Feb 23 '19 at 00:44
  • Well, there's no learner that allows you to force predictions to be within a certain range -- that's simply not how machine learning algorithms are set up. You can scale variables and do benchmark experiments in mlr that allow you to simply try lots of different learners and see which one works best. The tutorial has more detail on that. – Lars Kotthoff Feb 23 '19 at 02:31
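
For illustration only, the rescaling idea raised in the comments can be sketched outside mlr with the glmnet package directly (assumed to be installed). Everything here is an assumption rather than the asker's setup: the data are simulated, the coordinate and predictor names are hypothetical, and the spatial folds are a crude k-means grouping of site locations, not mlr's spatial resampling. The response is divided by 24 and logit-transformed, a lasso is tuned over the spatial folds, and predictions are mapped back so they fall inside (0, 24).

```r
# Sketch only: simulated data; names and the fold count are assumptions.
library(glmnet)

set.seed(1)
n <- 120
coords <- cbind(lon = runif(n), lat = runif(n))  # hypothetical site coordinates
x <- matrix(rnorm(n * 8), n, 8, dimnames = list(NULL, paste0("x", 1:8)))
# Simulated daily supply hours, strictly inside (0, 24)
y <- 24 * plogis(0.8 * x[, 1] - 0.5 * x[, 2] + rnorm(n, sd = 0.5))

# Goal 2: map y from (0, 24) onto the real line. Values of exactly 0 or 24
# would need nudging inward first, because qlogis(0) and qlogis(1) are infinite.
z <- qlogis(y / 24)

# Goal 3: crude spatial blocks, grouping sites by location into 5 folds
foldid <- kmeans(coords, centers = 5)$cluster

# Goal 1: lasso (alpha = 1), with lambda tuned over the spatial folds
fit <- cv.glmnet(x, z, alpha = 1, foldid = foldid)

# Back-transform so that every prediction lies inside (0, 24)
pred <- 24 * plogis(as.numeric(predict(fit, newx = x, s = "lambda.min")))
range(pred)
coef(fit, s = "lambda.min")  # coefficients shrunk to zero mark dropped variables
```

Because the model is fitted on the logit scale, the back-transformed values are not unbiased estimates of mean hours, and the cross-validated error is measured on the transformed scale; whether that trade-off is acceptable depends on the application.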

0 Answers