Running regressions where entire columns contain NA in R

Question

I have two data frames, df1 and df2, which contain the dependent and independent variables, respectively, for three different entities: b, c and d. I want to estimate the effect of each X variable on its corresponding Y variable in a loop and store the results in an array. When I run the code, it returns this error:

Error in lm.fit(x, y, offset = offset, singular.ok = singular.ok, ...) : 0 (non-NA) cases

I'm assuming the problem arises because column "cx" in df2 is entirely filled with NAs. I thought setting na.action = na.omit might do the trick, but the same error arises.

df1 <- data.frame(
    ay = c(2000, 2001, 2002, 2003),
    by = c(5,6,7,8),
    cy = c(1,2,3,4),
    dy = c(NA, NA, 8, 9)
)

df2 <- data.frame(
    ax = c(2000, 2001, 2002, 2003),
    bx = c(9,10,11,12),
    cx = c(NA, NA, NA, NA),
    dx = c(NA, NA, 3, 4)
)

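# One row per entity; columns: intercept, slope, then their t values
out = array(0, c(2, 4))
for (i in 1:2) {
      # i = 2 regresses cy on cx, which is all NA, so lm() stops with the error above
      fit = lm(df1[,1+i] ~ df2[,1+i], na.action = na.omit)
      out[i,1:2] = summary(fit)$coef[,1]   # coefficient estimates
      out[i,3:4] = summary(fit)$coef[,3]   # t values
}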

Ideally I would like to omit the regression for cy against cx by recognising that cx contains no data, but am unsure how to code for this.
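
One possible way to skip such pairs, offered as a sketch rather than a definitive fix: count the complete (non-NA) rows for each y/x pair before calling lm(), and move on when there are too few. The threshold `min_obs` and the column labels in `out` are names introduced here for illustration, not part of the original code. The sketch also assumes that at least three complete rows are wanted so that the t values are defined, and it loops over all three entities rather than two.

```r
df1 <- data.frame(
  ay = c(2000, 2001, 2002, 2003),
  by = c(5, 6, 7, 8),
  cy = c(1, 2, 3, 4),
  dy = c(NA, NA, 8, 9)
)
df2 <- data.frame(
  ax = c(2000, 2001, 2002, 2003),
  bx = c(9, 10, 11, 12),
  cx = c(NA, NA, NA, NA),
  dx = c(NA, NA, 3, 4)
)

# Hypothetical threshold (an assumption): with fewer than 3 complete rows
# a two-parameter model has no residual degrees of freedom.
min_obs <- 3

n_entities <- ncol(df1) - 1  # first column holds the year
out <- matrix(
  NA_real_, nrow = n_entities, ncol = 4,
  dimnames = list(names(df1)[-1],
                  c("intercept", "slope", "t_intercept", "t_slope"))
)

for (i in seq_len(n_entities)) {
  # Pair up the i-th y and x columns and keep only rows where both are present
  d <- data.frame(y = df1[[i + 1]], x = df2[[i + 1]])
  d <- d[complete.cases(d), ]

  # Skip this entity when there is not enough data; its row in out stays NA
  if (nrow(d) < min_obs) next

  cf <- summary(lm(y ~ x, data = d))$coefficients

  # Skip if the slope could not be estimated (e.g. x constant), so cf has one row
  if (nrow(cf) < 2) next

  out[i, 1:2] <- cf[, "Estimate"]
  out[i, 3:4] <- cf[, "t value"]
}

out
```

With the sample data, only the by ~ bx row is filled in (intercept -4, slope 1). The cy row stays NA because cx has no data, and the dy row stays NA because only two complete rows remain under the assumed threshold. Since by is an exact linear function of bx here, summary() may warn about an essentially perfect fit. Lowering `min_obs` to 2 would fit dy ~ dx as well, but its t values would then not be meaningful.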

Please check this: https://stackoverflow.com/questions/44200195/how-to-debug-contrasts-can-be-applied-only-to-factors-with-2-or-more-levels-er — Duck, Sep 15 '20 at 17:44
what is the model supposed to do when you don't give it any data? All observations are missing and `lm` is saying I can't do anything with that. `na.omit` just drops NAs but you have to have some non NA values left over — Nate, Sep 15 '20 at 18:21
@Nate Since my original dataset contains large numbers of variables (entities) I was hoping the model would return values only where it has data to run the regression. — Trong Nguyen, Sep 16 '20 at 16:35
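
Following up on the comments above: with a large number of entities, it may help to see up front which pairs have any usable rows at all. A minimal sketch, assuming the y and x columns line up by position as in the question; `usable` is a name introduced here for illustration.

```r
df1 <- data.frame(
  ay = c(2000, 2001, 2002, 2003),
  by = c(5, 6, 7, 8),
  cy = c(1, 2, 3, 4),
  dy = c(NA, NA, 8, 9)
)
df2 <- data.frame(
  ax = c(2000, 2001, 2002, 2003),
  bx = c(9, 10, 11, 12),
  cx = c(NA, NA, NA, NA),
  dx = c(NA, NA, 3, 4)
)

# For each entity, count rows where both y and x are non-NA
# (the first column of each data frame is the year, so it is dropped)
usable <- colSums(!is.na(df1[-1]) & !is.na(df2[-1]))
usable

# Entities that lm() cannot fit because no complete rows are left
names(usable)[usable == 0]
```

Here `usable` is 4 for by, 0 for cy and 2 for dy, so cy is the only pair that triggers the "0 (non-NA) cases" error.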
