I've been assigned to create an odds ratio plot with ggplot2 in R. The plot I'm supposed to create is shown below.

Given plot

My job is to figure out the code that reproduces this exact plot in R. I've done most of it. Here is my work.

My work

Before jumping into my code, it is important to note that I am not using the correct values for boxOdds, boxCILow, and boxCIHigh, since I have not figured them out yet. I wanted to get the ggplot code working first so I can enter the right values as soon as I find them.

This is the code I used:

library(ggplot2)

boxLabels = c("Females/Males", "Student-Centered Prac. (+1)", "Instructor Quality (+1)", "Undecided / STM", 
              "non-STEM / STM", "Pre-med / STM", "Engineering / STM", "Std. test percentile (+10)", 
              "No previous calc / HS calc", "College calc / HS calc")

df <- data.frame(yAxis = length(boxLabels):1,
                 boxOdds = 
                   c(2.23189, 1.315737, 1.22866, 0.8197413, 0.9802449, 0.9786673, 0.6559005, 0.5929812, 0.6923759, 1.3958275),
                 boxCILow = 
                   c(.7543566,1.016,.9674772,.6463458,.9643047,.864922,.4965308,.3572142, 0.4523759, 1.2023275),
                 boxCIHigh = 
                   c(6.603418,1.703902,1.560353,1.039654,.9964486,1.107371,.8664225,.9843584, 0.9323759, 1.5893275) 
)


(p <- ggplot(df, aes(x = boxOdds, y = boxLabels)) + 
    geom_vline(aes(xintercept = 1), size = 0.75, linetype = 'dashed') +
    geom_errorbarh(aes(xmax = boxCIHigh, xmin = boxCILow), size = .5, height = 
                     0, color = 'gray50') + 
    geom_point(size = 3.5, color = 'orange') +
    theme_bw() +
    theme(panel.grid.minor = element_blank()) +
    scale_x_continuous(breaks = seq(0,7,1) ) +
    ylab('') +
    xlab('Odds Ratio') + 
    annotate(geom = 'text', y =1.1, x = 3.5, label ='', 
             size = 3.5, hjust = 0) + ggtitle('Estimated Odds of Switching') + 
    theme(plot.title = element_text(hjust = 0.5, size = 30), 
          axis.title.x = (element_text(size = 15))) + 
    theme(panel.grid.minor = element_blank(), panel.grid.major = element_blank())
)
p

Where I'm stuck:

  1. Removing the small vertical lines at the beginning and end of each row's CI. I was not sure what they are called, so I had a hard time looking it up. SOLVED

  2. I'm also stuck on coloring specific rows in different colors.

  3. The last part I'm stuck on is getting the variables in the proper order on the y-axis. As you can see in my code (the "boxLabels" part), I have listed all the variables in the order of the given plot, but R ignored that order. The variable at the very top is "Undecided / STM" instead of "Females/Males".

  4. How do I decrease the space from 0 to 1? SOLVED

Any help would be appreciated!

hank
  • All your questions have answers somewhere on SO: 1. https://stackoverflow.com/questions/45693025/remove-error-bar-ends-in-r-using-ggplot2 2. https://stackoverflow.com/questions/6919025/how-to-assign-colors-to-categorical-variables-in-ggplot2-that-have-stable-mappin 3. https://stackoverflow.com/questions/5208679/order-bars-in-ggplot2-bar-graph 4. simply remove coord_trans? – erc Apr 01 '19 at 06:09
  • Question 4 worked perfectly, thanks! – hank Apr 01 '19 at 07:40

1 Answer

First, you probably want ggstance::geom_pointrangeh, which draws a horizontal range with no end caps. Second, you can define the colors right at the beginning; to give several rows the same color, create a new variable group. Third, the ordering depends on your data: turn yAxis into a factor and assign the labels to it. Fourth, remove coord_trans as suggested by @beetroot.

Assign factor labels

# levels=1:10 pairs yAxis 10 (the first data row) with "Females/Males" and draws it on top
dat$yAxis <- factor(dat$yAxis, levels=1:10, labels=rev(boxLabels))
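
Why the factor levels matter: ggplot2 draws a discrete y axis from the first factor level at the bottom to the last level at the top, and a plain character column gets alphabetically sorted levels. A minimal sketch of that behaviour, using a made-up three-row data frame (toy and its odds values are hypothetical, not the question's data):

# Hypothetical toy data, listed in the order wanted from top to bottom
toy <- data.frame(label = c("Females/Males", "Undecided / STM", "College calc / HS calc"),
                  odds  = c(2.2, 0.8, 1.4),
                  stringsAsFactors = FALSE)

# A character column is treated like a factor with alphabetically sorted levels,
# which is why "Undecided / STM" ended up on top in the question
levels(factor(toy$label))

# Reversing the wanted order puts the first row at the top of the y axis,
# because the last level is drawn highest
toy$label <- factor(toy$label, levels = rev(toy$label))
levels(toy$label)

The same idea applies to the full data: whichever label should appear at the top has to be the last factor level.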

Create groups

dat$group <- 1
dat$group[which(dat$yAxis %in% c("Females/Males", "Undecided / STM", "non-STEM / STM",
                        "Pre-med / STM"))] <- 2
dat$group[which(dat$yAxis %in% c("Student-Centered Prac. (+1)",
                               "No previous calc / HS calc", 
                               "College calc / HS calc"))] <- 3

Colors

colors <- c("#860fc2", "#fc691d", "black")

Plot

library(ggplot2)
library(ggstance)
ggplot(dat, aes(x=boxOdds, y=yAxis, color=as.factor(group))) +
  geom_vline(aes(xintercept=1), size=0.75, linetype='dashed') +
  geom_pointrangeh(aes(xmax=boxCIHigh, xmin=boxCILow), size=.5, 
                   show.legend=FALSE) +
  geom_point(size=3.5, show.legend=FALSE) +
  theme_bw() +
  scale_color_manual(values=colors)+
  theme(panel.grid.minor=element_blank()) +
  scale_x_continuous(breaks=seq(0,7,1), limits=c(0, max(dat[2:4]))) +
  ylab('') +
  xlab('Odds Ratio') +
  annotate(geom='text', y =1.1, x=3.5, label ='', 
           size=3.5, hjust=0) + ggtitle('Estimated Odds of Switching') + 
  theme(plot.title=element_text(hjust=.5, size=20)) +
  theme(panel.grid.minor=element_blank(), panel.grid.major=element_blank())
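
As an alternative to ggstance, and assuming ggplot2 3.3.0 or later is installed, geom_pointrange itself accepts xmin and xmax and then draws the range horizontally. The sketch below uses a hypothetical three-row data frame (toy, with rounded values) and a named color vector so that each group keeps its color regardless of level order; it is an illustration, not a drop-in replacement for the plot above:

library(ggplot2)  # assumption: version 3.3.0 or later

# Hypothetical subset with rounded values, just to keep the sketch self-contained
toy <- data.frame(
  label = factor(c("Females/Males", "Undecided / STM", "College calc / HS calc"),
                 levels = rev(c("Females/Males", "Undecided / STM", "College calc / HS calc"))),
  odds  = c(2.23, 0.82, 1.40),
  low   = c(0.75, 0.65, 1.20),
  high  = c(6.60, 1.04, 1.59),
  group = factor(c(2, 2, 3))
)

p <- ggplot(toy, aes(x = odds, y = label, color = group)) +
  geom_vline(xintercept = 1, linetype = "dashed") +
  # Supplying xmin/xmax makes the range horizontal; a pointrange has no end caps
  geom_pointrange(aes(xmin = low, xmax = high), show.legend = FALSE) +
  # Names match the levels of group, so the mapping does not depend on order
  scale_color_manual(values = c("2" = "#fc691d", "3" = "black")) +
  theme_bw() +
  labs(x = "Odds Ratio", y = NULL)
p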

Gives

[Image: resulting odds ratio plot]


Data

dat <- structure(list(yAxis = 10:1, boxOdds = c(2.23189, 1.315737, 1.22866, 
0.8197413, 0.9802449, 0.9786673, 0.6559005, 0.5929812, 0.6923759, 
1.3958275), boxCILow = c(0.7543566, 1.016, 0.9674772, 0.6463458, 
0.9643047, 0.864922, 0.4965308, 0.3572142, 0.4523759, 1.2023275
), boxCIHigh = c(6.603418, 1.703902, 1.560353, 1.039654, 0.9964486, 
1.107371, 0.8664225, 0.9843584, 0.9323759, 1.5893275)), class = "data.frame", row.names = c(NA, 
-10L))
jay.sf
  • Thank you for your reply but I still had some questions after your answers. For my second question, I want to assign same color for "Female/Males", "Undecided/STM", "non-STEM/STM", and "Pre-med/STM", instead of coloring all differently. For the third answer, what did you mean by assigning factor levels? What I meant by Question 4 was how do I make the space smaller from 0 to geom_vline. If you look at the "Given plot" and "My work", you will realize that the "Given plot" has much smaller space from 0 to the geom_vline. – hank Apr 01 '19 at 07:37