I am testing for multicollinearity in a dataset. VIF() runs fine and confirms that multicollinearity is present. However, when I run vif() to check each variable and see whether one is worth removing, I get the same error every time, even though a classmate has almost exactly the same code and his works.

From my understanding, VIF() takes a fitted model, such as the result of lm(), so I did that and it works. vif(), I believe, should accept a data frame directly, so I tried converting my data with data.frame() before passing it to vif(), but that does not work. Just to see what happens, I also tried passing the lm() fit to vif(), but that fails as well.

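    library(fmsb)  # provides VIF()
    library(usdm)  # provides vif()
    d <- read.table('9.10data.txt', col.names = c('y', 'x1', 'x2', 'x3', 'x4'))
    reg <- lm(y ~ x1 + x2 + x3 + x4, data = d)
    VIF(reg)
    # VIF = 26.94823 > 10, so multicollinearity is present.
    d <- data.frame(d)
    vif(d)  # this is the call that fails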

I would expect to get a table showing the VIF value for each variable x1, x2, x3, x4, but I keep getting the error message: Error in y[, i] : incorrect number of dimensions

If the data would help, it is at http://users.stat.ufl.edu/~rrandles/sta4210/Rclassnotes/data/textdatasets/KutnerData/Chapter%20%209%20Data%20Sets/CH09PR10.txt. I copied and pasted the data into a text editor and saved it as a .txt file.

UseR10085
DustyJ

1 Answer

The vif() function in the usdm package uses a variable named y internally. If your data frame also has a column named y, the two clash and you get this error. Renaming the y column (here to z) removes the error:

    library(usdm)  # for vif()
    library(fmsb)  # for VIF()
    d <- read.table('try.txt', col.names = c('z', 'x1', 'x2', 'x3', 'x4'))
    reg <- lm(z ~ x1 + x2 + x3 + x4, data = d)
    VIF(reg)
    # VIF = 26.94823 > 10, so multicollinearity is present.
    d <- data.frame(d)
    vif(d)
    #   Variables       VIF
    # 1         z 26.948231
    # 2        x1  3.711784
    # 3        x2  1.419321
    # 4        x3 12.570942
    # 5        x4  5.034769
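
Note that the z row in the output above equals the single value returned by VIF(reg). That is consistent with both functions computing 1 / (1 - R^2), where vif(d) takes every column of the data frame in turn, including the response, as the dependent variable. Below is a minimal base-R sketch of that formula, assuming this is what the two packages compute. The data are synthetic, and `vif_one` is a hypothetical helper that belongs to neither fmsb nor usdm.

```r
# Synthetic data (made up for illustration): x3 is nearly x1 + x2
set.seed(42)
n  <- 50
x1 <- rnorm(n)
x2 <- rnorm(n)
x3 <- x1 + x2 + rnorm(n, sd = 0.3)
d  <- data.frame(z = 1 + x1 - x2 + rnorm(n), x1 = x1, x2 = x2, x3 = x3)

# Hypothetical helper: regress one column on all other columns of df
# and return 1 / (1 - R^2)
vif_one <- function(df, col) {
  fit <- lm(reformulate(setdiff(names(df), col), response = col), data = df)
  1 / (1 - summary(fit)$r.squared)
}

# Model-level value: R^2 of the full regression of z on the predictors
vif_model <- 1 / (1 - summary(lm(z ~ ., data = d))$r.squared)

# Per-predictor values, computed among the predictors only
predictors <- d[, c("x1", "x2", "x3")]
vif_pred   <- sapply(names(predictors), vif_one, df = predictors)

print(vif_model)
print(vif_pred)
```

If only the predictors' VIFs are of interest, passing just those columns to vif() (for example `d[, -1]`) would keep the response out of the auxiliary regressions.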

Thanks to @jav for the detailed explanation in https://stackoverflow.com/a/58742464/6123824
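
The name clash can be reproduced in base R alone. The sketch below assumes that usdm's vif() internally calls its data frame argument `y` and fits `lm(y[, i] ~ ., data = y[-i])` for each column. This is an assumption inferred from the error message and the linked answer, not checked against the package source here. `vif_sketch` is a hypothetical stand-in, and the data are synthetic.

```r
# Hypothetical stand-in for the assumed internal pattern:
# the data frame argument is itself named `y`.
vif_sketch <- function(y) {
  out <- setNames(rep(NA_real_, ncol(y)), colnames(y))
  for (i in seq_len(ncol(y))) {
    # lm() looks up `y` in `data` first. If `data` has a column named y,
    # `y[, i]` indexes a plain vector with two subscripts and fails.
    fit <- lm(y[, i] ~ ., data = y[-i])
    out[i] <- 1 / (1 - summary(fit)$r.squared)
  }
  out
}

# Synthetic data (made up for illustration)
set.seed(1)
n  <- 30
x1 <- rnorm(n)
x2 <- x1 + rnorm(n, sd = 0.5)
d_bad <- data.frame(y = x1 + x2 + rnorm(n), x1 = x1, x2 = x2)
d_ok  <- setNames(d_bad, c("z", "x1", "x2"))  # same data, column y renamed

# With a column named y, the call fails; capture the message instead of stopping
res_bad <- tryCatch(vif_sketch(d_bad), error = function(e) conditionMessage(e))
# With the column renamed to z, the same call succeeds
res_ok  <- vif_sketch(d_ok)

print(res_bad)
print(res_ok)
```

Under this assumption, the first iteration still works because the y column has been dropped from `data`; the failure appears once the y column is among the remaining columns.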

UseR10085
  • The packages I used are 'fmsb' for VIF and 'usdm' for vif. I can run VIF just fine on the reg, but when I try vif(reg) I receive an error: unable to find an inherited method for function ‘vif’ for signature ‘"lm"’ – DustyJ Nov 07 '19 at 21:38
  • I will try the package car, however, like I said, my classmate is able to run it just fine using virtually the exact same code and he uses the dataset as the argument for vif(), so there must be something I'm missing. Also, when I run VIF(reg) I get a single value 26.9, which tells me there is multicollinearity, which is what I believe I am supposed to get. – DustyJ Nov 07 '19 at 21:41
  • So I tried installing both of the packages you mentioned; now the VIF() function basically does what the vif() function does instead of giving me a single value for the regression equation as a whole. My professor gave us the packages I mentioned above to use, so I need to have vif() run correctly from the 'usdm' package, which supposedly takes in a data frame. – DustyJ Nov 07 '19 at 21:47
  • See the answer here: https://stackoverflow.com/a/58742464/6123824. – UseR10085 Nov 07 '19 at 23:00
  • @DustyJ see my edit now. It is working perfectly OK. – UseR10085 Nov 08 '19 at 15:09