6

I would like to draw boxplots of the relationship between a continuous and a categorical variable (geom_boxplot with ggplot2), and do this for several situations (facet_wrap). That part is quite easy:

library(ggplot2)

data("CO2")
ggplot(CO2, aes(Treatment, uptake)) + 
  geom_boxplot(col="black", fill="white", alpha=0, width=.5) + 
  geom_point(col="black", size=1.2) + 
  facet_wrap(~Type, ncol=3, nrow=6, scales= "free_y") + 
  theme_bw() + 
  ylab("Uptake")


This works well with the toy dataset, but when applied to my own data (where facet_wrap lets me plot 18 different graphs) the y-axes are hard to read: the number of y-ticks and the spacing between them vary from one graph to the next.

What would be a good way to harmonize the y-axes? That is, I would like equal spacing between the y-axis ticks regardless of what the breaks are. The breaks will necessarily differ from one graph to another, because the range of my continuous variable varies a lot.

Thank you very much for any help :)

Z.Lin
Chrys
  • Even spacing may result in crazy weird numbers on the axes. Is that what you want? It's unclear to me from your description exactly what the desired output would be for this input. It's also helpful to provide a [reproducible example](https://stackoverflow.com/questions/5963269/how-to-make-a-great-r-reproducible-example) with your data that can be used to test possible solutions. – MrFlick Aug 30 '18 at 17:08
  • Question is too broad as final result is probably opinion based. – pogibas Aug 30 '18 at 17:11
  • Thank you very much MrFlick and PoGibas for your answers, and sorry for not being clear enough about my expectations. Thank you for the link to this post about reproducible examples; I am still not very comfortable with the best way to provide them! The answer provided by Z.Lin perfectly resolved my issue. Thank you all! – Chrys Aug 31 '18 at 07:14

2 Answers

8

You can force each facet's limits to something relatively nice looking by manually expanding each facet's range of values: apply pretty() to the facet's y-axis values and take the first and last values it returns.
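
To make the idea concrete, here is a minimal base-R sketch of what pretty() does for a single facet. The vector y is made-up illustration data (it is not from the diamonds dataset), and the variable names are hypothetical:

```r
# Hypothetical y-values for a single facet (illustration only)
y <- c(0.23, 0.41, 0.58, 1.07, 1.96)

# pretty() returns round, evenly spaced values that cover the range of y
p <- pretty(y)

# The first and last of those values can serve as "nice" lower and
# upper limits for that facet's y-axis
limits <- c(p[1], p[length(p)])

print(p)
print(limits)
```

The limits always enclose the data, so forcing a facet's axis to span them never cuts off any points.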

The following is an example using the diamonds dataset:

library(ggplot2)
library(dplyr)  # provides %>%, filter(), group_by(), summarise()

# normal facet_wrap plot with many different y-axis scales across facets
p <- ggplot(diamonds %>% filter(cut %in% c("Fair", "Ideal")), 
       aes(x = cut, y = carat) ) + 
  geom_boxplot(col="black", fill="white", alpha=0, width=.5) + 
  geom_point(col="black", size=1.2) + 
  facet_wrap(~clarity, scales= "free_y", nrow = 2) + 
  theme_bw() + 
  ylab("Carat")

p


# modified plot with consistent label placements
p + 
  # Manually create values to expand the scale, by finding "pretty" 
  # values that are slightly larger than the range of y-axis values 
  # within each facet; set alpha = 0 since they aren't meant to be seen
  geom_point(data = . %>% 
               group_by(clarity) %>% #group by facet variable
               summarise(y.min = pretty(carat)[1],
                         y.max = pretty(carat)[length(pretty(carat))]) %>%
               tidyr::gather(key, value, -clarity), 
             aes(x = 1, y = value),
             inherit.aes = FALSE, alpha = 0) +

  # Turn off automatic scale expansion, & manually set scale breaks
  # as an evenly spaced sequence (with the "pretty" values created above
  # providing the limits for each facet). If there are many facets to
  # show, I recommend no more than 3 labels in each facet, to keep things
  # simple.
  scale_y_continuous(breaks = function(x) seq(from = x[1], 
                                              to = x[2], 
                                              length.out = 3), 
                     expand = c(0, 0))
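
As for why this works, my understanding is as follows. With scales = "free_y", each facet's y-limits are computed from the data of every layer, including the invisible points placed at the "pretty" minimum and maximum. Because expand = c(0, 0) removes the usual padding, the limits passed to the breaks function are exactly those pretty values, and the function simply returns three evenly spaced values between them. The breaks function on its own can be sketched in base R (even_breaks is a hypothetical name used only for this illustration):

```r
# Same logic as the function passed to `breaks` above: given a facet's
# limits c(lower, upper), return n evenly spaced break positions
even_breaks <- function(limits, n = 3) {
  seq(from = limits[1], to = limits[2], length.out = n)
}

# Two facets with very different ranges (made-up limits)
b1 <- even_breaks(c(0, 2))
b2 <- even_breaks(c(0, 500))

print(b1)
print(b2)
```

Both facets get a label at the bottom, middle, and top of the panel, so the tick positions line up visually across facets even though the values differ.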


Z.Lin
  • Thanks a lot Z.Lin, this is exactly what I needed! All the best! – Chrys Aug 31 '18 at 07:15
  • This also worked for me, as I had the exact same need. I consider myself an intermediate R user, but I cannot figure out why this works. – acircleda May 15 '20 at 18:11
0

Just remove scales = "free_y" from facet_wrap (it is an argument of facet_wrap, not of geom_point), and you should get what you want.

However, as MrFlick rightly pointed out in the comments, even spacing will most likely result in odd-looking numbers on the axes.

Nitish Sahay
  • Thank you very much nitish.s. Unfortunately, removing scales="free_y" would force the y-axes to be identical across all facets, which is not what I want because I have very different values from one facet to another (and need all of them to be easily readable and not "compressed"). Have a great day! – Chrys Aug 31 '18 at 07:18