I have a dataset of 15 numeric variables (1 under examination and its 14 regressors). I run a recursive forecasting algorithm that splits the data into an in-sample and an out-of-sample part. I want to figure out how to store the results produced for each value of a and t, which are passed to the cv.hqreg function (hqreg package) as alpha and tau.

  • Note: for each combination of t and a we get 1 value per run (predictdQR in the code). For each combination we run cv.hqreg 648 times, and then again 648 times for the next combination. Thus the end result will be a matrix/dataset of 648 rows and 231 columns (11 values of a times 21 values of t).

For each cv.hqreg call I get 100 fitted models, from which I select the one with the smallest error via LMQ$fit$beta[,which(LMQ$lambda.min==LMQ$lambda)].

    dataR <- TRAINSET
    fittedvaluesQRidge <- NULL
    for (i in 1:(nrow(TESTSET)-1)) {        # adding a new row and repeat
      print(i)                              # to see it works / where it stops
      dataR <- rbind(dataR, TESTSET[i,])    # update dataset once per i, not once per (a, t)
      for (a in seq(0,1,0.1)) {             # for each penalty of selection
        for (t in seq(0,1,0.05)) {          # for each quantile
          # FIT THE Lasso Quantile-MODEL
          LMQ <- cv.hqreg(as.matrix(dataR[,-15]), dataR$LHS, method = "quantile", tau = t, alpha = a)
          best <- which(LMQ$lambda.min == LMQ$lambda)   # column of the model with the smallest error
          # find the forecast: intercept plus one term per regressor
          predictdQR <- LMQ$fit$beta[1,best] +
            LMQ$fit$beta[2,best]*TESTSET[i+1,1]   + LMQ$fit$beta[3,best]*TESTSET[i+1,2] +
            LMQ$fit$beta[4,best]*TESTSET[i+1,3]   + LMQ$fit$beta[5,best]*TESTSET[i+1,4] +
            LMQ$fit$beta[6,best]*TESTSET[i+1,5]   + LMQ$fit$beta[7,best]*TESTSET[i+1,6] +
            LMQ$fit$beta[8,best]*TESTSET[i+1,7]   + LMQ$fit$beta[9,best]*TESTSET[i+1,8] +
            LMQ$fit$beta[10,best]*TESTSET[i+1,9]  + LMQ$fit$beta[11,best]*TESTSET[i+1,10] +
            LMQ$fit$beta[12,best]*TESTSET[i+1,11] + LMQ$fit$beta[13,best]*TESTSET[i+1,12] +
            LMQ$fit$beta[14,best]*TESTSET[i+1,13] + LMQ$fit$beta[15,best]*TESTSET[i+1,14]
          fittedvaluesQRidge <- c(fittedvaluesQRidge, predictdQR)   # then put them in a vector
        }
      }
    }

The commands I have used to get the predicted value are quite long, since they use one variable at a time. I have tried matrix algebra instead (matrix of the covariates %*% data), but got no results, only an error: non-numeric argument to binary operator. The code above works, in an ugly way, but if there is a shorter way I would appreciate any help.
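
The long sum in the loop is an intercept plus a dot product between the coefficient vector and one row of regressors, so it can probably be collapsed. What follows is a minimal sketch with made-up numbers, not tested against hqreg itself: `beta` stands in for `LMQ$fit$beta[,which(LMQ$lambda.min==LMQ$lambda)]` and `newrow` for `TESTSET[i+1, 1:14]`, both assumptions for illustration. `%*%` needs numeric matrices or vectors, and a one-row data.frame is a list, so the row is converted with `unlist()` first.

```r
# Hypothetical stand-ins: 1 intercept + 14 slopes, and one row of 14 regressors
set.seed(1)
beta   <- c(0.5, (1:14) / 10)                         # plays the role of the selected beta column
newrow <- as.data.frame(matrix(rnorm(14), nrow = 1))  # plays the role of TESTSET[i + 1, 1:14]

# Term-by-term version, equivalent to the hand-written sum
long_form <- beta[1]
for (k in 1:14) long_form <- long_form + beta[k + 1] * newrow[1, k]

# Shorter version: intercept + dot product
x <- unlist(newrow)                      # data.frame row -> plain numeric vector
short_form <- beta[1] + sum(beta[-1] * x)

# Same thing with %*% (gives a 1 x 1 matrix, hence drop())
matrix_form <- drop(c(1, x) %*% beta)

all.equal(long_form, short_form)
```

If the real `TESTSET` has the response in column 15, selecting columns `1:14` explicitly before `unlist()` keeps it out of the product.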

Hercules Apergis
  • So what is your question? Your title says *saving results of a nested loop*, but you are saving data per last vector, `fittedvaluesQRidge`. Does code not work? Undesired results? – Parfait Dec 02 '16 at 03:51
  • Yes, I get the data in a vector, but it is '11*21*658' long, while I want it to come out as a matrix. For example: for t=0.1 and a=0 I get their 648 results, then for t=0.2 and a=0 I get the next 648, and so on. I really can't visualise how to set up the results to come out as a matrix. – Hercules Apergis Dec 02 '16 at 10:19
  • I thought you were familiar from your [last question](http://stackoverflow.com/questions/40667261/how-to-apply-a-regression-in-a-for-loop-for-all-the-variables-of-a-dataset-while)! I see you do not use any apply functions. `sapply` can bind the *predictdQR* vectors in a matrix. Where are the *a* and *t* loop variables being used? – Parfait Dec 02 '16 at 14:33
  • I really tried but I failed miserably. Well, 't' and 'a' are used inside the 'cv.hqreg' function. – Hercules Apergis Dec 02 '16 at 14:54
  • Do you need a matrix or an array? You are showing more than two dimensions: 11 X 21 X 658. – Parfait Dec 02 '16 at 22:10
  • Actually it is a matrix with 648 rows. Column 1 has the values for t=0, a=0; column 2 has the values for t=0, a=0.1; etc. (with t having 21 values and a 11 values) – Hercules Apergis Dec 03 '16 at 00:55
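
With the loop order in the question (i outermost, then a, then t), the result vector holds every (a, t) value for the first i, then every value for the second i, and so on. One way to get a matrix with one row per i is therefore to fill it row by row. This is a sketch with small made-up sizes: `n_i`, `fake_forecast` and the reduced grids are hypothetical stand-ins for the 648 steps, the `cv.hqreg` prediction and the full grids.

```r
a_grid <- seq(0, 1, 0.5)    # 3 values instead of 11
t_grid <- seq(0, 1, 0.25)   # 5 values instead of 21
n_i    <- 4                 # 4 steps instead of 648

# Hypothetical forecast that encodes its own (i, a, t) so the layout can be checked
fake_forecast <- function(i, a, t) i * 100 + a * 10 + t

# Same loop order and accumulation as in the question
fitted <- NULL
for (i in seq_len(n_i)) {
  for (a in a_grid) {
    for (t in t_grid) {
      fitted <- c(fitted, fake_forecast(i, a, t))
    }
  }
}

# One row per i, one column per (a, t) pair; byrow = TRUE because i changes slowest
n_col <- length(a_grid) * length(t_grid)
res <- matrix(fitted, ncol = n_col, byrow = TRUE)

# Label the columns in the order the inner loops visit them (t changes fastest)
labels <- expand.grid(t = t_grid, a = a_grid)
colnames(res) <- paste0("a=", labels$a, "_t=", labels$t)

dim(res)
```

Here the columns come out with t changing fastest. Swapping the two inner loops (and the argument order in `expand.grid`) would make a change fastest, which is the column order described in the last comment.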

1 Answer

Consider sapply() with expand.grid(). sapply() iterates over a single vector, here the row indices of the grid, and returns a matrix when every call returns a vector of the same length. expand.grid() cross joins two vectors into a data.frame object, so you can capture every combination of a and t:

at_combns <- expand.grid(a=seq(0,1,0.1), t=seq(0,1,0.05))

matpredictdQR <- sapply(seq(nrow(at_combns)), function(j, i){
  # UPDATE dataset
  dataR <- rbind(TRAINSET, TESTSET[1:i,])

  # FIT THE Lasso Quantile-MODEL 
  LMQ <- cv.hqreg(as.matrix(dataR[,-15]),dataR$LHS,method = "quantile",
                  tau=at_combns$t[j], alpha=at_combns$a[j]) 

  # column of the model with the smallest error
  best <- which(LMQ$lambda.min==LMQ$lambda)

  predictdQR <- LMQ$fit$beta[1,best]+
                LMQ$fit$beta[2,best]*TESTSET[i+1,1]+
                LMQ$fit$beta[3,best]*TESTSET[i+1,2]+
                LMQ$fit$beta[4,best]*TESTSET[i+1,3]+
                LMQ$fit$beta[5,best]*TESTSET[i+1,4]+
                LMQ$fit$beta[6,best]*TESTSET[i+1,5]+
                LMQ$fit$beta[7,best]*TESTSET[i+1,6]+
                LMQ$fit$beta[8,best]*TESTSET[i+1,7]+
                LMQ$fit$beta[9,best]*TESTSET[i+1,8]+
                LMQ$fit$beta[10,best]*TESTSET[i+1,9]+
                LMQ$fit$beta[11,best]*TESTSET[i+1,10]+
                LMQ$fit$beta[12,best]*TESTSET[i+1,11]+
                LMQ$fit$beta[13,best]*TESTSET[i+1,12]+
                LMQ$fit$beta[14,best]*TESTSET[i+1,13]+
                LMQ$fit$beta[15,best]*TESTSET[i+1,14]

  return(predictdQR)

}, seq(nrow(TESTSET)-1))
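
One thing worth checking in the call above: `sapply()` forwards extra arguments unchanged to every call, so `i` arrives as the whole vector `seq(nrow(TESTSET)-1)` rather than one value at a time. In that case `1:i` would look only at the first element of `i` (typically with a warning), which may be why `dataR` appears to stay the same size inside the function. A possible alternative is to nest two `sapply()` calls. The sketch below uses made-up sizes, and `fake_forecast` is a hypothetical stand-in for the fit-and-predict step.

```r
# Small stand-ins for the real grid and the 648 forecasting steps
at_combns <- expand.grid(a = seq(0, 1, 0.5), t = seq(0, 1, 0.25))
steps <- 1:4

# What the function receives when i is passed as an extra sapply argument:
# every call sees all of 'steps' at once, not one step per call
seen <- sapply(seq(nrow(at_combns)), function(j, i) length(i), steps)

# Hypothetical forecast that encodes (i, a, t)
fake_forecast <- function(i, a, t) i * 100 + a * 10 + t

# Nested version: the inner sapply walks over i, the outer one over the grid rows
mat <- sapply(seq(nrow(at_combns)), function(j) {
  sapply(steps, function(i) {
    fake_forecast(i, at_combns$a[j], at_combns$t[j])
  })
})
colnames(mat) <- paste0("a=", at_combns$a, "_t=", at_combns$t)

dim(mat)   # rows = steps, columns = (a, t) combinations
```

In the real code the inner function would build `dataR <- rbind(TRAINSET, TESTSET[1:i,])` for its single `i`, fit `cv.hqreg`, and return one number.
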
Parfait
  • I don't think this updates the matrix on each run, i.e. run 1: 361 rows, run 2: 362 rows, run 3: 363 rows and so on. I used `print(dim(dataR))` and saw that `dataR` was now a `1x15` matrix. I think using `rbind(dataR,TESTSET[i,])` will work. Also, I think you need to add a `)` at the end of the whole code to close the sapply function. – Hercules Apergis Dec 03 '16 at 13:16
  • Also, every time the code gets to the `cv.hqreg` call, `R` just crashes. I know that is where it crashes because `cv.hqreg` prints which cross-validation step it is at when it runs. – Hercules Apergis Dec 03 '16 at 13:26
  • OK, after switching PCs the code ran. The only issue is that it doesn't update dataR. Neither `rbind(dataR,TESTSET[1:i,])` nor `rbind(dataR,TESTSET[i,])` updates it. The first gives `dim = 1008x15` and the latter `dim = 362x15` in each run. – Hercules Apergis Dec 03 '16 at 14:04
  • The function never returns dataR; it only updates it temporarily inside `sapply` to calculate predictdQR, which I thought was the main focus here. Did you not retrieve the desired predicted values matrix? Have dataR updated in a separate for loop. dataR is now built from TRAINSET in the `rbind` call. See edit. – Parfait Dec 03 '16 at 15:07
  • Sadly, `rbind(dataR,TESTSET[1:i,])` gives a dataR of 362 rows on every run. It doesn't seem to update the matrix, which is weird given that we `rbind`. Same result for `rbind(dataR,TESTSET[i,])`, but there we get 1008 rows straight away. I don't understand why. – Hercules Apergis Dec 03 '16 at 22:35
  • Did you see update, using `TRAINSET`? See my answer again. – Parfait Dec 04 '16 at 00:05
  • Sorry for taking so long to respond, but the code runs slowly and it takes a long time to see whether it actually works. Will get back to you soon (in a few hours). Thank you. – Hercules Apergis Dec 04 '16 at 01:15
  • Still the same problem! The dataset doesn't update... it stays at 362 rows. – Hercules Apergis Dec 04 '16 at 11:33
  • Like I mentioned above, dataR will not update in `sapply`, as it is not returned, only used as a temp variable to calculate predictdQR and the fitted values. Does the fitted values matrix come back with the correct dims (648 x 231) and data points? – Parfait Dec 04 '16 at 14:18
  • Yes, it returns the correct matrix. But I added a `print(dim(dataR))` command, and on every run it shows a 362x15 dataset. Shouldn't it get bigger and bigger in rows with each run? – Hercules Apergis Dec 04 '16 at 14:37
  • Wow just learned something new! Replace `<-` with `<<-` operator to write to objects in outer scope. See [here](http://stackoverflow.com/a/13640219/1422451). So: `dataR <<- rbind(TRAINSET, TESTSET[1:i,])`. This will update dataR. – Parfait Dec 04 '16 at 16:28
  • I really wish I could attach an image, because it really stopped at dim 362x15. Maybe I should specify `i<-1:648`. I have a feeling that i is not specified explicitly, and thus it is treated as `i=1`. I really don't know. Here is a pic of how it ended: https://www.dropbox.com/s/9ci4y33d667jl93/Screenshot%202016-12-05%2001.36.46.png?dl=0 The dimensions of `matpredictdQR` are correct, however, so I really don't understand what is happening. – Hercules Apergis Dec 05 '16 at 01:37
  • Forgive me, did you try my suggestion in last comment? Your image cuts off. It works great on my mockup. Note I did not adjust my answer to it. And again, dataR *does* change inside the anonymous sapply function so that's why final matrix works. Just the outer scope dataR doesn't unless you use `<<-` (notice the double angle bracket). – Parfait Dec 05 '16 at 02:02
  • Yes, I did use `<<-`. I don't really understand why it doesn't show. Maybe it is the `print` command, which isn't able to keep printing as the rows are being added? – Hercules Apergis Dec 05 '16 at 11:50
  • What is the `dim()` of dataR prior to the sapply call? Your screenshot shows several scripts (some Untitled) on the same environment. Maybe you overwrite it? Try running a clean Rstudio session with just the one specific script. Finally, is your initial question not answered? Does the matrix correctly output your needed results? – Parfait Dec 05 '16 at 18:49
  • `dim()` shows the dimensions of the dataR, like number of rows and columns. The scripts are all on the same topic but previous steps. – Hercules Apergis Dec 06 '16 at 00:48
  • I know what the `dim()` function is. I meant: what does it show (rows/cols) prior to sapply? Always `362,15`? Try removing everything inside sapply except the `dataR <<- ...` line and your `print` call. Does it change with each iteration? – Parfait Dec 06 '16 at 01:06
  • not prior, but inside the `sapply`. I have set it up inside the `function(j,i)`, which is called by the `sapply` – Hercules Apergis Dec 06 '16 at 01:44
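
The scoping point discussed in these comments can be checked in isolation. This is a minimal sketch with made-up names (`rows` plays the role of `nrow(dataR)`): `<-` inside a function creates a local variable, while `<<-` assigns to the variable found in the enclosing environment.

```r
rows <- 361   # pretend this is nrow(dataR) before the loop

# Local assignment: each call starts from the outer value, the outer 'rows' is untouched
local_result <- sapply(1:3, function(i) { rows <- rows + 1; rows })
rows_after_local <- rows

# Super-assignment: each call updates the outer 'rows', so the value keeps growing
inside <- sapply(1:3, function(i) { rows <<- rows + 1; rows })
rows_after_super <- rows
```

Comparing `local_result` with `inside` shows the difference between a value that restarts on every call and one that accumulates across calls.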