13

I've fitted a logistic regression model that predicts the a binary outcome vs from mpg (mtcars dataset). The plot is shown below. How can I determine the mpg value for any particular vs value? For example, I'm interested in finding out what the mpg value is when the probability of vs is 0.50. Appreciate any help anyone can provide!

model <- glm(vs ~ mpg, data = mtcars, family = binomial)

ggplot(mtcars, aes(mpg, vs)) + 
    geom_point() + 
    stat_smooth(method = "glm", method.args = list(family = "binomial"), se = FALSE)

enter image description here

hsl
  • 670
  • 2
  • 10
  • 22

2 Answers2

10

The easiest way to calculate predicted values from your model is with the predict() function. Then you can use a numerical solver to find particular intercepts. For example

findInt <- function(model, value) {
    function(x) {
        predict(model, data.frame(mpg=x), type="response") - value
     }
}

uniroot(findInt(model, .5), range(mtcars$mpg))$root
# [1] 20.52229

Here findInt just takes the model and a particular target value and returns a function that uniroot can solve for 0 to find your solution.

MrFlick
  • 195,160
  • 17
  • 277
  • 295
  • That doesn't look right. Shouldn't the answer be more like 20.5? – eipi10 Aug 16 '15 at 22:34
  • 1
    Oh, I think you just need `type="response"` in `predict` so that you're returning probabilities instead of log(odds). – eipi10 Aug 16 '15 at 22:34
  • Ah, i didn't realize `stat_smooth ` was plotting the response by default. I've updated. – MrFlick Aug 16 '15 at 22:35
  • Thanks a lot! I've tried to solve it by hand and I can't find a function in R that allows me to do it quickly. So I'm guessing the `findInt` function you've defined should be the simplest way to solve for x. – hsl Aug 16 '15 at 22:49
  • These were helpful. I wonder if there is any way to find such value without using an `lm` or `glm`. In other words, could we directly find the x value corresponding to a mathematical function defined in `R`? – David C. Jan 24 '17 at 19:54
  • @ydavidchen you can use functions like optim() or uniroot() with arbitrary functions. – MrFlick Jan 24 '17 at 20:19
  • how can you apply the solution when you have multiple regression? like when you have m<-glm(Y~ X1+X2 , data=data, family="binomial") where X1 is factor and X2 is continuous. and we are interested in finding X2 value. – Malek Ik Apr 11 '20 at 03:36
7

You can solve for mpg directly as follows:

mpg = (log(p/(1-p)) - coef(model)[1])/coef(model)[2]

Detailed explanation:

When you fit the regression model, the equation you are fitting is the following:

log(p/(1-p)) = a + b*mpg

Where p is the probability that vs=1, a is the intercept and b is the coefficient of mpg. From the model fit results (just type model or summary(model)) we see that a = -8.8331 and b = 0.4304. We want to find mpg when p=0.5. So, the equation we need to solve is:

log(0.5/(1-0.5)) = -8.331 + 0.4304*mpg
log(1) = 0 = -8.331 + 0.4303*mpg

Rearranging,

mpg = 8.8331/0.4304 = 20.523

In general, to solve for mpg for any value of p:

mpg = (log(p/(1-p)) + 8.8331)/0.4304

Or, to make it more easily reproducible:

mpg = (log(p/(1-p)) - coef(model)[1])/coef(model)[2]
eipi10
  • 91,525
  • 24
  • 209
  • 285