Good afternoon, I have a question about the function below. The task is to write a function in R that computes heteroskedasticity-robust confidence intervals for the coefficient estimates (betas) of a linear regression.

My attempt does not return any output: the console simply does nothing when I call the function. I really wonder why, especially since the manual computation in the last two lines of my code works fine. Even though you don't have the necessary data.frames, perhaps you can take a look at my code and tell me what is wrong with it, or propose an alternative way to solve my problem :)

For clarity: the original numerical values (using all 200 data points each) of the coefficients, as given by the lm model, are c(463.2121, 139.5762), the HC standard errors (stderrorHC) are c(74.705054, 5.548689), and for the HC-robust standard errors I use the package sandwich.

my_CI <- function (mod, level = 0.95)
{
  `%>%` <- magrittr::`%>%`
  standard_deviation <- stderrorHC
  Margin_Error <- abs(qnorm((1-0.95)/2))*standard_deviation 
  df_out <- data.frame(stderrorHC, mod,Margin_Error=Margin_Error,
                       'CI lower limit'=(mod - Margin_Error),
                       'CI Upper limit'=(mod + Margin_Error)) %>%
    return(df_out)
}

my_CI(mod, level = 0.95) #retrieving does not return any results for me

Definitions:
library(sandwich)  # provides vcovHC()

women <- read.table("women.txt")
men <- read.table("men.txt")
converged <- merge(women, men, all = TRUE)
level <- c(0.95, 0.975)
modell <- lm(formula = loan ~ education, data = converged)
mod <- modell$coefficients                # point estimates
vcov <- vcovHC(modell, type = "HC1")      # HC1 robust covariance matrix
stderrorHC <- sqrt(diag(vcov))            # robust standard errors

mod - abs(qnorm((1-level[1])/2))*stderrorHC 
mod + abs(qnorm((1-level[1])/2))*stderrorHC

Addition: Here is some data from the original dataset. I included just ten data points per group, so in this case the confidence interval would need to be based on the t-distribution.

dataMenEducation <- c(12, 17, 16, 11, 20, 20, 11, 19, 15, 16)
dataMenLoan <- c(2404.72, 3075.313, 2769.543, 2009.295, 3105.121, 4269.216,
                 2213.730, 4025.136, 2605.191, 2760.186)
dataWomenEducation <- c(12, 14, 16, 19 , 12, 19, 20, 17, 16, 10)
dataWomenLoan <- c(1920.667, 2278.255, 2296.804, 2977.048, 1915.740, 3557.991, 
                   3336.683, 2923.040, 2628.351, 1918.218)
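The point about small samples can be sketched as follows. This is a minimal sketch, not the poster's code: margin_z and margin_t are hypothetical helper names, and the 18 degrees of freedom are an assumption (twenty sample rows minus the two coefficients of loan ~ education). The only change relative to the normal-based interval is that qt() replaces qnorm().

```r
# Hypothetical helpers (names are illustrative, not from the post).

# Margin of error from a normal critical value, as in the question's code
margin_z <- function(se, level = 0.95) {
  qnorm(1 - (1 - level) / 2) * se
}

# Margin of error from a t critical value with df residual degrees of freedom
margin_t <- function(se, df, level = 0.95) {
  qt(1 - (1 - level) / 2, df = df) * se
}

# With a standard error of 1 the margins equal the critical values themselves
z_crit <- margin_z(1)           # about 1.96
t_crit <- margin_t(1, df = 18)  # about 2.10; df = 20 - 2 is an assumption

print(c(normal = z_crit, t = t_crit))
```

For a fitted model, df.residual(modell) would supply the degrees of freedom, so the t-based margin could be written as margin_t(stderrorHC, df = df.residual(modell)).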
Todgeweiht
  • Welcome to SO Todgeweiht! Could you please provide some example data which the community could use to solve your issue? Please refer to [this page](https://stackoverflow.com/questions/5963269/how-to-make-a-great-r-reproducible-example) for more information. – Dion Groothof Jan 09 '22 at 13:38
  • For sure, no problem. But I don't see a way to provide the original data (.txt format). Should I just copy and paste about 10 rows of each data.frame below my question, or is there another way? – Todgeweiht Jan 09 '22 at 13:49
  • I have now added some data for example. – Todgeweiht Jan 09 '22 at 14:09

1 Answer

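I believe that the following provides you with the desired output. Compared with the function in the question, the trailing %>% pipe before return() is removed, and the hard-coded 0.95 is replaced by the level argument.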

# install.packages('sandwich')
library(sandwich) # contains vcovHC()

# data
df <- data.frame(education = c(12, 17, 16, 11, 20, 20, 11, 19, 15, 16,
                              12, 14, 16, 19 , 12, 19, 20, 17, 16, 10),
                loan = c(2404.72, 3075.313, 2769.543, 2009.295, 3105.121, 4269.216,
                         2213.730, 4025.136, 2605.191, 2760.186,
                         1920.667, 2278.255, 2296.804, 2977.048, 1915.740, 3557.991, 
                         3336.683, 2923.040, 2628.351, 1918.218))
df$sex <- factor(gl(2, nrow(df)/2, labels = c('males', 'females')))

# linear model
fit <- lm(loan ~ education + sex, data = df)
coefs <- fit$coefficients
vcov <- vcovHC(fit, type = "HC1")
stderrorHC <- sqrt(diag(vcov))

# function to compute robust confidence intervals (uses stderrorHC defined above)
my_CIs <- function (coefs, level = 0.95) {
  standard_deviation <- stderrorHC
  Margin_Error <- abs( qnorm( (1-level)/ 2) ) * standard_deviation 
  df_out <- data.frame(stderrorHC, coefs, Margin_Error = Margin_Error,
                       'CI lower limit' = (coefs - Margin_Error),
                       'CI Upper limit' = (coefs + Margin_Error))
  return(df_out)
}

Output

> my_CIs(coefs = coefs)
            stderrorHC     coefs Margin_Error CI.lower.limit CI.Upper.limit
(Intercept)  295.86900  160.3716    579.89259      -419.5210      740.26416
education     23.64313  176.0111     46.33968       129.6714      222.35073
sexfemales   132.07169 -313.2632    258.85576      -572.1189      -54.40743
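The function above reads stderrorHC from the global environment, so it only works after that object has been created. A self-contained variant might look like the sketch below. It is not the answerer's code: robust_ci is a hypothetical name, and instead of calling sandwich::vcovHC() it computes the HC1 covariance by hand in base R as n/(n-k) * (X'X)^-1 X' diag(e^2) X (X'X)^-1, which is the usual HC1 definition. Like the answer, it uses a normal critical value.

```r
# Hypothetical helper (not from the original post): HC1-robust confidence
# intervals from a fitted lm object, using base R only.
robust_ci <- function(fit, level = 0.95) {
  X <- model.matrix(fit)            # design matrix
  e <- residuals(fit)               # OLS residuals
  n <- nrow(X)
  k <- ncol(X)
  bread <- solve(crossprod(X))      # (X'X)^-1
  meat  <- crossprod(X * e)         # X' diag(e^2) X (row i of X scaled by e_i)
  vc    <- n / (n - k) * bread %*% meat %*% bread  # HC1 small-sample scaling
  se    <- sqrt(diag(vc))
  z     <- qnorm(1 - (1 - level) / 2)  # normal critical value, as in the answer
  est   <- coef(fit)
  data.frame(estimate = est, se_hc1 = se, margin = z * se,
             lower = est - z * se, upper = est + z * se)
}

# Same sample data as in the answer
dat <- data.frame(
  education = c(12, 17, 16, 11, 20, 20, 11, 19, 15, 16,
                12, 14, 16, 19, 12, 19, 20, 17, 16, 10),
  loan = c(2404.72, 3075.313, 2769.543, 2009.295, 3105.121, 4269.216,
           2213.730, 4025.136, 2605.191, 2760.186,
           1920.667, 2278.255, 2296.804, 2977.048, 1915.740, 3557.991,
           3336.683, 2923.040, 2628.351, 1918.218))
dat$sex <- gl(2, nrow(dat) / 2, labels = c("males", "females"))

fit <- lm(loan ~ education + sex, data = dat)
res <- robust_ci(fit)
print(res)
```

On this data the result should agree with the table above up to rounding. Passing the fitted model rather than the coefficient vector keeps the estimates and their standard errors tied to the same fit.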
Dion Groothof
  • Thank you, Dion! I'm pleased with the great support here, your solution works great! :) Have a nice weekend, see you. As a quick wrap-up: just delete the magrittr pipe used in the function above. Nick – Todgeweiht Jan 09 '22 at 18:06