I built a logistic regression model (called 'mylogit') using the glm function in R as follows:

mylogit <- glm(answer ~ as.factor(gender) + age, data = mydata, family = "binomial")

where age is numeric and gender is categorical (male and female).

I then proceeded to make predictions with the model built.

pred <- predict(mylogit, type = "response")

I can easily make a time series plot of the predictions by doing:

plot.ts(ts(pred))

to give a plot that looks like this:

[Figure: Plot of Time against Predictions]

My question is this: Is it possible to put the x axis in segments according to gender (male or female) which was specified in the glm? In other words, can I have predictions on the y axis and have gender (divided into male and female) on the x axis?

To show a sample of the data I want to plot, I did:

bind = cbind(mydata, pred)

'bind' looks like this:

pred          age        gender
0.9461198     32          male
0.9463577     45         female
0.9461198     45         female
0.9461198     37         female
0.9477645     40          male
0.8304513     32         female
Mikee
  • So basically you want to sort the x-axis, not add any additional axes? In that case I think you should change the title of the question. – Backlin Feb 25 '16 at 08:11
  • Great, could you also add a small sample dataset, e.g. with `dput(head(mydata))`? I guess the problem is that the object you get from `ts(pred)` is already sorted in some random unwanted way, but it is hard to tell without having the same data as you. Sorting `mydata` by gender before modelling with `glm` might even solve the problem. – Backlin Feb 25 '16 at 08:36
  • @Backlin I've added a sample of the data set to the original question. – Mikee Feb 25 '16 at 08:57

2 Answers

Check out #4 on this blog post, "4. How To Create Two Different X- or Y-axes".

My suggestion to you is that you look at some of the dedicated R plotting tools, like ggplot2.

  • Note that in ggplot it is actively discouraged to have multiple axes, and it is quite a trick to achieve - see this question http://stackoverflow.com/questions/3099219/plot-with-2-y-axes-one-y-axis-on-the-left-and-another-y-axis-on-the-right – bdecaf Feb 25 '16 at 08:05
  • `ggplot2` won't help you here. Being based on the [grammar of graphics](http://www.springer.com/gb/book/9780387245447), it considers any graph requiring multiple axes to be inherently flawed in its design. You have to stick with base graphics as far as I know. – Backlin Feb 25 '16 at 08:06
  • Good points about ggplot theory and execution. I suggested ggplot precisely because it does have a very specific way of doing things, i.e., a framework, and one that helps you think about how you're going to use R well. – Dheeraj Chand Feb 25 '16 at 08:19

I don't think you need to use ts and plot.ts because the data you have is not a time series, right? Just sort pred before plotting.

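# Get data (named csv_text to avoid masking base::str)
csv_text <- "pred,age,gender
0.9461198,32,male
0.9463577,45,female
0.9461198,45,female
0.9461198,37,female
0.9477645,40,male
0.8304513,32,female"
# stringsAsFactors = TRUE keeps gender a factor, which col = needs below
# (since R 4.0.0 read.csv returns character columns by default)
bind <- read.csv(text = csv_text, stringsAsFactors = TRUE)

# Plot: sort by gender so each group is contiguous, then colour by gender
bind <- bind[order(bind$gender), ]
plot(bind$pred, col = bind$gender)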

library(ggplot2)
ggplot(bind, aes(x = gender, y = pred)) +
  geom_point(position = position_jitter(width = .3))

Or without creating bind you could do plot(pred[order(mydata$gender)]).
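
If you also want the x axis itself to be labelled in segments by gender, something along these lines might work with base graphics. This is only a minimal sketch, assuming the six sample rows from the question; the helper names `idx`, `mids` and `breaks` are made up for illustration and are not part of the original code.

```r
# Sample rows copied from the question's 'bind' table (illustration only)
bind <- data.frame(
  pred   = c(0.9461198, 0.9463577, 0.9461198, 0.9461198, 0.9477645, 0.8304513),
  age    = c(32, 45, 45, 37, 40, 32),
  gender = factor(c("male", "female", "female", "female", "male", "female"))
)

# Sort so that each gender occupies one contiguous block on the x axis
bind <- bind[order(bind$gender), ]
idx <- seq_len(nrow(bind))

# Midpoint of each gender's block: this is where its axis label goes
mids <- tapply(idx, bind$gender, mean)

# Boundaries between consecutive blocks, for the separator lines
breaks <- head(cumsum(table(bind$gender)), -1) + 0.5

# Suppress the default numeric x axis and draw one label per segment
plot(idx, bind$pred, col = bind$gender, pch = 19,
     xaxt = "n", xlab = "gender", ylab = "pred")
axis(1, at = as.vector(mids), labels = names(mids), tick = FALSE)
abline(v = breaks, lty = 2)
```

With your real data you would replace the `data.frame(...)` call with your own `bind`, making sure `gender` is a factor.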

Backlin