I have tabulated the results (OR, 95% CI) of logistic regression models, and I want to show the odds ratios in a "forest plot-like" graph with a zebra theme, grouped by sex (group = Female or Male).

df <- structure(list(aOR = c(0.755657618643583, 1.62135330604067, 0.865398256316296, 
0.727203230474939, 2.11334255563662, 0.898407262562538), confidence_lower = c(0.613593844349365, 
1.35046132165669, 0.841880115298544, 0.6196490131542, 1.79796736338675, 
0.877221949869505), confidence_upper = c(0.930613046191133, 1.94658410489249, 
0.889573382749048, 0.853425934984244, 2.48403661179473, 0.920104210280172
), group = c("Female", "Female", "Female", "Male", "Male", "Male"
), labels = c("Age", "Diabetes", "BMI", "Age", "Diabetes", "BMI"
)), row.names = c(NA, -6L), class = "data.frame")

I started coding using ggplot. Thanks to other questions I found here, I discovered that in order to show the dots and bars grouped by sex, I need to convert the labels to a factor.

This is my code:

library(ggplot2)

#Factors: fix the order in which the variables appear on the y axis
Variable_order <- c('Age', 'Diabetes', 'BMI')
df$labels <- factor(df$labels, levels = Variable_order)

#Define colours for dots and bars
dotCOLS = c("orchid2","dodgerblue2")
barCOLS = c("black","black")

#Plot
p <- ggplot (df, aes(x=aOR, xmin=confidence_lower, xmax=confidence_upper, y=labels, col=group, fill=group)) +
  geom_linerange(linewidth=0, position=position_dodge(width = 0.5)) +
  geom_vline(xintercept=1, lty=2) +
  geom_point(size=3, shape=21, stroke = 0, position=position_dodge(width = 0.5)) +
  scale_fill_manual(values=dotCOLS)+
  labs(x="Odds ratio") +
  scale_color_manual(values=barCOLS)+
  coord_cartesian(ylim=c(1,4), xlim=c(-2.5, 4.5)) +
  annotate("text", x=-1, y=4.5, label = "Reduce death") +
  annotate("text", x=3, y=4.5, label = "Increase death") +
  theme_classic() 
print(p)

This is the resulting graph: [image]

This is the result I want to achieve, and as you can see I am far from it: [image]

I found a similar question here [https://stackoverflow.com/questions/15420621/reproduce-table-and-plot-from-journal/20266137#20266137]; however, the CSV dataset is no longer downloadable, so I don't know how to reproduce the code.

My problem is that I need to group the ORs by sex, and I haven't found a solution for that here. Please help. I want to use ggplot because it is more customizable than other existing packages.

user19745561
  • You could take a look at https://stackoverflow.com/questions/62246541/forest-plot-with-table-ggplot-coding/62312135#62312135 – Allan Cameron Mar 08 '23 at 18:48
  • @AllanCameron Thank you, I have already seen that answer, but my problem is that I don't know how to split the odds ratios by sex. In that example they are not split the way I showed in the image in my post – user19745561 Mar 08 '23 at 18:53

1 Answer


You can use the group aesthetic and position_dodge:

ggplot(df, aes(x = aOR, y = labels, group = group)) +
  geom_hline(yintercept = c("Age", "Diabetes"), linewidth = 50, 
             colour = "gray92") +
  geom_point(aes(shape = group), position = position_dodge(0.4), size = 3) +
  geom_errorbar(aes(xmin = confidence_lower, xmax = confidence_upper),
                 position = position_dodge(0.4), width = 0.05) +
  geom_vline(xintercept = 1) +
  annotate(geom = "text", x = c(0.2, 5), y = c(Inf, Inf), size = 6, fontface = 2,
           label = c("Reduce death", "Increase death"), vjust = 1.5) +
  scale_x_log10(breaks = c(0.1,0.2,  0.5, 1, 2, 5, 10), limits = c(0.1, 10)) +
  labs(y = NULL, x = "Odds ratio (log scale)") +
  theme_classic(base_size = 16) +
  theme(axis.line.y = element_blank())

[image]

Allan Cameron
  • Is there a way to write the Female and Male labels on the Y axis, as in the example above, instead of showing them in a legend? – user19745561 Mar 08 '23 at 23:02
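
One possible way to approach the follow-up comment, offered as a sketch under assumptions rather than as part of the answer above: map group to the y axis and use facet_grid() with switch = "y", so that each variable becomes a strip on the left and Female/Male appear as ordinary axis labels. The data values below are rounded copies of those in the question, and the zebra stripes are not reproduced here.

```r
library(ggplot2)

# Rounded copy of the data from the question (assumption: 3 decimals are enough
# for illustration)
df <- data.frame(
  aOR              = c(0.756, 1.621, 0.865, 0.727, 2.113, 0.898),
  confidence_lower = c(0.614, 1.350, 0.842, 0.620, 1.798, 0.877),
  confidence_upper = c(0.931, 1.947, 0.890, 0.853, 2.484, 0.920),
  group            = c("Female", "Female", "Female", "Male", "Male", "Male"),
  labels           = c("Age", "Diabetes", "BMI", "Age", "Diabetes", "BMI")
)
df$labels <- factor(df$labels, levels = c("Age", "Diabetes", "BMI"))

# group goes on the y axis; each variable gets its own facet row
p <- ggplot(df, aes(x = aOR, y = group)) +
  geom_vline(xintercept = 1) +
  geom_errorbar(aes(xmin = confidence_lower, xmax = confidence_upper),
                width = 0.2) +
  geom_point(aes(shape = group), size = 3, show.legend = FALSE) +
  # switch = "y" moves the facet strips (Age, Diabetes, BMI) to the left
  facet_grid(labels ~ ., switch = "y") +
  scale_x_log10(breaks = c(0.1, 0.2, 0.5, 1, 2, 5, 10), limits = c(0.1, 10)) +
  labs(y = NULL, x = "Odds ratio (log scale)") +
  theme_classic(base_size = 16) +
  theme(strip.placement   = "outside",          # strips outside the axis labels
        strip.background  = element_blank(),
        strip.text.y.left = element_text(angle = 0, face = "bold"),
        axis.line.y       = element_blank(),
        panel.spacing     = unit(0, "lines"))   # no gap between facet rows
print(p)
```

With this layout the legend is no longer needed, since each row is already labelled Female or Male next to its variable name. The dodging from the answer is dropped because the two groups now sit on separate y positions.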