Context

I have a data frame with multiple columns, and I need to run a regression between alternate pairs of columns (1 and 2, 3 and 4, and so on), predict on a test data set, and then compute the residuals. I do not want to reference each column by name (like target.data$nifty), so I am using a data matrix.

Problem

When I predict using a data matrix, I get the following error:

number of items to replace is not a multiple of replacement length

I know my training data set has 255 rows and my test data set has 50 rows, but predict should still work on the test data set and return 50 predicted values. It does not.

b is a matrix holding the training data set (255 rows, 8 columns) and t is a matrix holding the test data set (50 rows, 8 columns).

I tried passing the matrix directly as newdata in predict, but that gave the following error:

Error in eval(predvars, data, env) : numeric 'envir' arg not of length one

... so I converted it into a data frame. Please suggest how I can use a matrix inside predict.

Here's the code I am using:

b <- as.matrix(target.data)
t <- as.matrix(target.test)
t_pred <- matrix(, nrow = 50, ncol = 8)
t_res <- matrix(, nrow = 50, ncol = 8)

for (i in 1:6) {
  if (!i %% 2) {
    t_model <- lm(b[,i] ~ b[,i])
    t_pred[,i] <- predict(t_model, newdata=data.frame(t[,i]), type = 'response')
    t_res[,i] <- t_pred[,i] - t[,i]
  }
}
Maciej Jureczko
  • Please provide a Minimal, Complete, Verifiable Example, including the data necessary to reproduce the error. http://stackoverflow.com/help/mcve – Hack-R Oct 08 '17 at 16:34
  • Here are the links to the training data set and the test data set: https://drive.google.com/file/d/0B0aDk67OMbPFY1lhc3A1Nk1NekE/view?usp=sharing https://drive.google.com/file/d/0B0aDk67OMbPFbHRxMGRXdFA4QVU/view?usp=sharing – Rahul Verma Oct 08 '17 at 18:13
  • Thank you. Unfortunately Stack Overflow Meta decided not to allow cloud links due to (1) security problems like viruses and (2) because they tend to break over time and thus it removes the value of the question for future readers. Please read this and my earlier link for instructions on how to make a reproducible example https://stackoverflow.com/questions/5963269/how-to-make-a-great-r-reproducible-example – Hack-R Oct 08 '17 at 18:16
  • Training data > dput(head(target.data,2)) structure(list(NIB.SJ.EQUITY = c(365.13, 365.13), JALSH.INDEX = c(6625.25, 6692.83), X8374857Q.US.Equity = c(9.92, 9.56), SPX.INDEX = c(899.72, 890.16), CTSH.US.EQUITY = c(3.0572, 3.1385), SPX.INDEX.1 = c(895.75, 897.38), KBC.BB.EQUITY = c(23.401, 23.2689), BEL20.INDEX = c(1865.94, 1871.84)), .Names = c("NIB.SJ.EQUITY", "JALSH.INDEX", "X8374857Q.US.Equity", "SPX.INDEX", "CTSH.US.EQUITY", "SPX.INDEX.1", "KBC.BB.EQUITY", "BEL20.INDEX"), row.names = 1:2, class = "data.frame") – Rahul Verma Oct 08 '17 at 18:39
  • Test data > dput(head(target.test,2)) structure(list(NIB.SJ.EQUITY = c(270.89, 270.89), JALSH.INDEX = c(7445.33, 7490.17), X8374857Q.US.Equity = c(10.8, 10.85), SPX.INDEX = c(1039.58, 1036.3), CTSH.US.EQUITY = c(5.524, 5.6099), SPX.INDEX.1 = c(1034.15, 1042.44), KBC.BB.EQUITY = c(32.8847, 32.1653), BEL20.INDEX = c(2358.87, 2315.08)), .Names = c("NIB.SJ.EQUITY", "JALSH.INDEX", "X8374857Q.US.Equity", "SPX.INDEX", "CTSH.US.EQUITY", "SPX.INDEX.1", "KBC.BB.EQUITY", "BEL20.INDEX"), row.names = 1:2, class = "data.frame") – Rahul Verma Oct 08 '17 at 18:40
  • If I use the column names directly, I get the following error: > t_model <- lm(target.data$NIB.SJ.EQUITY ~ target.data$JALSH.INDEX) > pred <- predict(t_model, newdata=target.test$JALSH.INDEX, type = 'response') Error in eval(predvars, data, env) : numeric 'envir' arg not of length one – Rahul Verma Oct 08 '17 at 18:41
  • When using column names, I solved the "Error in eval(predvars, data, env) : numeric 'envir' arg not of length one" error with mod <- lm(NIB.SJ.EQUITY ~ JALSH.INDEX,data = target.data) predic <- predict(mod,newdata=target.test), but I still get the error when I use a matrix. – Rahul Verma Oct 09 '17 at 05:03
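
A likely explanation for the matrix case, going by the code in the question: the formula b[,i] ~ b[,i] refers to the object b itself, not to a column name that predict can look up in newdata. Since data.frame(t[,i]) contains no variable called b, predict falls back to the training matrix and returns 255 values, which cannot be stored in a 50-row column. The sketch below is one possible way to loop over column pairs by index; it is not the asker's code. It assumes the odd column is the response and the following even column is the predictor (as in the NIB.SJ.EQUITY ~ JALSH.INDEX comment), uses made-up random data in place of target.data and target.test, and the names make_data, train, test, and fit are hypothetical.

```r
# Synthetic stand-ins for target.data (255 rows) and target.test (50 rows).
# The real data are not available here, so these values are made up.
set.seed(1)
make_data <- function(n) {
  m <- matrix(rnorm(n * 8), nrow = n, ncol = 8)
  colnames(m) <- paste0("V", 1:8)
  as.data.frame(m)
}
target.data <- make_data(255)
target.test <- make_data(50)

n_test <- nrow(target.test)
t_pred <- matrix(NA_real_, nrow = n_test, ncol = ncol(target.test))
t_res  <- matrix(NA_real_, nrow = n_test, ncol = ncol(target.test))

# Assumption: regress each odd column on the even column after it (1 ~ 2, 3 ~ 4, ...).
for (i in seq(1, ncol(target.data) - 1, by = 2)) {
  # Build small data frames with fixed column names, so the formula y ~ x
  # is resolved inside whichever data frame is supplied.
  train <- data.frame(y = target.data[[i]], x = target.data[[i + 1]])
  test  <- data.frame(x = target.test[[i + 1]])

  fit <- lm(y ~ x, data = train)
  t_pred[, i] <- predict(fit, newdata = test)    # one prediction per test row
  t_res[, i]  <- target.test[[i]] - t_pred[, i]  # observed minus predicted
}
```

Results are stored in the column of the response variable, so the even columns of t_pred and t_res stay NA. The residual here is observed minus predicted, which is the opposite sign to the loop in the question; swap the operands if the other convention is wanted. The same pattern should work with matrices by replacing target.data[[i]] with b[, i] and target.test[[i]] with t[, i].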

0 Answers