I am running the following weighted regression on clustered data, and I keep getting NAs for my clustered standard errors using the clx function. Please note that I am using na.action = na.exclude , the reason for that is that I eventually append the residuals from each observation to the dataframe for further manipulation.
#partialing the data
partialData <- data[data$group_code2 %in% group_list,]
#running the regression and calculating clustered standard errors
reg1 <- lm(employed ~ age + eduction , data=partialData, weights = w, na.action=na.exclude)
clx(fm=reg1, dfcw = 1, cluster= partialData$group_code2)
clx <- function(fm, dfcw, cluster){
# R-codes (www.r-project.org) for computing
# clustered-standard errors. Mahmood Arai, Jan 26, 2008.
# The arguments of the function are:
# fitted model, cluster1 and cluster2
# You need to install libraries `sandwich' and `lmtest'
# reweighting the var-cov matrix for the within model
library(sandwich);library(lmtest)
M <- length(unique(cluster))
N <- length(cluster)
K <- fm$rank
dfc <- (M/(M-1))*((N-1)/(N-K))
uj <- apply(estfun(fm),2, function(x) tapply(x, cluster, sum));
vcovCL <- dfc*sandwich(fm, meat=crossprod(uj)/N)*dfcw
coeftest(fm, vcovCL) }
I originally suspected that na.exclude is giving me trouble because the values of some of the residuals would be NA. So is there a way to have clx ignore the NAs from both the cluster and the object reg1?
Please note that I understand I can remove non complete cases from the dataset prior to running any regressions, and that would solve the issue, but I am avoiding that because I am trying many possible specifications (so my covariates keep on changing) and I am trying to avoid the need to change the code for every set of possible covariates.