1

I am running the following weighted regression on clustered data, and I keep getting NAs for my clustered standard errors using the clx function. Please note that I am using na.action = na.exclude , the reason for that is that I eventually append the residuals from each observation to the dataframe for further manipulation.

 #partialing the data
 partialData <-  data[data$group_code2 %in% group_list,]

 #running the regression and calculating clustered standard errors
 reg1 <- lm(employed ~ age + eduction , data=partialData, weights = w, na.action=na.exclude)
 clx(fm=reg1, dfcw = 1,  cluster= partialData$group_code2)

 clx <- function(fm, dfcw, cluster){
 # R-codes (www.r-project.org) for computing
 # clustered-standard errors. Mahmood Arai, Jan 26, 2008.

 # The arguments of the function are:
 # fitted model, cluster1 and cluster2
 # You need to install libraries `sandwich' and `lmtest'

 # reweighting the var-cov matrix for the within model
 library(sandwich);library(lmtest)
 M <- length(unique(cluster))   
 N <- length(cluster)           
 K <- fm$rank                        
 dfc <- (M/(M-1))*((N-1)/(N-K))  
 uj  <- apply(estfun(fm),2, function(x) tapply(x, cluster, sum));
 vcovCL <- dfc*sandwich(fm, meat=crossprod(uj)/N)*dfcw
 coeftest(fm, vcovCL) }

I originally suspected that na.exclude is giving me trouble because the values of some of the residuals would be NA. So is there a way to have clx ignore the NAs from both the cluster and the object reg1?

Please note that I understand I can remove non complete cases from the dataset prior to running any regressions, and that would solve the issue, but I am avoiding that because I am trying many possible specifications (so my covariates keep on changing) and I am trying to avoid the need to change the code for every set of possible covariates.

1 Answers1

0

It is difficult to give you a helpful answer since your example is not reproducible. But the crux of the answer is that the original code for clustered standard errors by Mahmood Arai doesn't handle NAs.

In this case you could use the multiwayvcov package, which should handle your case:

require(lmtest)
require(multiwayvcov)

reg1 <- lm(employed ~ age + eduction, data=partialData, 
    weights = w, na.action=na.exclude)
coeftest(reg1, vcov=function(x) cluster.vcov(x, partialData$group_code2))  

See also:

Community
  • 1
  • 1
landroni
  • 2,902
  • 1
  • 32
  • 39