I am a beginner in R so I'm sorry if my question is basic and has been answered somewhere else but unfortunately I could not find the answer.
One of my predictor variables, nationality
, has 8 levels.
I want to create a user defined function that loops through each level in my variable nationality
, taking one level per regression. I created a list of the levels of the variable nationality
as such:
mylist <- list("bangladeshian", "british", "filipino", "indian",
"indonesian", "nigerian", "pakistani", "spanish")
then created a user defined function:
f1 <- function(x) {
l <- summary(glm(smoke ~ I(nationality == mylist[x]),
data=df.subpop, family=binomial(link="probit")))
print(l)
}
f1(2)
f1(2)
gives this output:
Call:
glm(formula = smoke ~ I(nationality == mylist[x]),
family = binomial(link = "probit"), data = df.subpop)
Deviance Residuals:
Min 1Q Median 3Q Max
-0.629 -0.629 -0.629 -0.629 1.853
Coefficients:
Estimate Std. Error z value Pr(>|z|)
(Intercept) -0.9173 0.1659 -5.530 3.21e-08 ***
I(nationality == mylist[x])TRUE -4.2935 376.7536 -0.011 0.991
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
(Dispersion parameter for binomial family taken to be 1)
Null deviance: 73.809 on 78 degrees of freedom
Residual deviance: 73.416 on 77 degrees of freedom
AIC: 77.416
Number of Fisher Scoring iterations: 14
As you can see, the coefficient for nationality is "I(nationality == mylist[x])TRUE"
which is not very informative and requires the user to refer back to the line of code
f1(2)
and also to mylist
to understand the level that that coefficient represents. I believe there should be a cleaner and more straightforward way to do this and accurately run a regression for each level without having to call f1()
8 times.