I have a data frame in R whose variables are indicators such as race, SAT score, high school GPA, dropout rate, and gender. I am trying to regress dropout rate on these as right-hand-side inputs. However, I only want to do this for Black and Hispanic students, who are coded as "B" and "H" under race.

    library(stargazer)

    # Keep only Black ("B") and Hispanic ("H") students and the model variables
    newdata <- subset(x.20, race %in% c("B", "H"),
                      select = c(race, individual.ind, institutional.ind, male, twohousehold,
                                 foreignbornparent, parentdegree, welfare, householdincome,
                                 schoolquality, SAT, privateschool, apcourses, socialdistance,
                                 peerinfluence, selfefficacy, selfesteem, hsgpa, droppedout))

    # Logit model of dropping out on all remaining variables
    mylogit <- glm(droppedout ~ race + individual.ind + institutional.ind + male + twohousehold +
                     foreignbornparent + parentdegree + welfare + householdincome + SAT +
                     privateschool + apcourses + hsgpa + schoolquality + socialdistance +
                     peerinfluence + selfefficacy + selfesteem,
                   family = binomial, data = newdata)

    # LaTeX regression table
    stargazer(mylogit, title = "Title: Logit Regression Results", type = "latex",
              single.row = TRUE, header = FALSE, column.sep.width = "1pt",
              digits = 1, covariate.labels = c("Race"))

The above code gives me a regression table in stargazer, but the regression coefficients are very different from the ones reported in the study I am replicating. Does anyone have any idea what is going wrong? Am I correctly subsetting the data to only Black and Hispanic students?
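One way to sanity-check the subset, sketched on made-up data rather than the real x.20 (the toy data frame, its values, and the reduced variable list are all assumptions for illustration): confirm which race codes survive the subset, then compare the number of rows in the subset with the number glm() actually used, since glm() drops rows with missing values under the default na.action.

```r
# Toy stand-in for x.20 (hypothetical values, NOT the real data set)
set.seed(1)
toy <- data.frame(
  race       = factor(sample(c("A", "B", "H", "W"), 200, replace = TRUE)),
  hsgpa      = round(runif(200, 2, 4), 2),
  droppedout = rbinom(200, 1, 0.3)
)
toy$hsgpa[sample(200, 15)] <- NA  # inject some missing values

# Same subsetting pattern as in the question, with fewer columns
sub <- subset(toy, race %in% c("B", "H"), select = c(race, hsgpa, droppedout))

# 1. Only "B" and "H" should have non-zero counts; the unused factor
#    levels are still listed (with count 0) until they are dropped.
print(table(sub$race))
sub$race <- droplevels(sub$race)

# 2. Fit the model, then compare rows in the subset with rows actually used.
fit <- glm(droppedout ~ race + hsgpa, family = binomial, data = sub)

n_subset   <- nrow(sub)                 # rows after subsetting
n_complete <- sum(complete.cases(sub))  # rows with no missing values
n_used     <- nobs(fit)                 # rows glm() kept for estimation

print(c(subset = n_subset, complete = n_complete, used = n_used))
```

If the number of rows used is far below the number of rows in the subset on the real data, the estimation sample may not match the one behind the published table. Separately, digits = 1 in the stargazer() call rounds every coefficient to one decimal place, which can by itself make the table look different from the original.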

Ghost Koi
When asking for help, you should include a simple [reproducible example](https://stackoverflow.com/questions/5963269/how-to-make-a-great-r-reproducible-example) with sample input and desired output that can be used to test and verify possible solutions. How different are the values from what you expect? Were the original values calculated from the same data set? There really aren't enough details here to say what's going on. – MrFlick Apr 24 '18 at 19:59
