I am making a scatterplot with facet_grid() like this:

library(ggplot2)
ggplot(df, aes(x, y)) +
  geom_point() +
  facet_grid(group1 ~ group2)

I want the y-axis title y to be centered on each facet row, like this (mocked up in Paint):

[Image: desired plot]

The number of facet rows is two in this example because df$group1 has two distinct values. In my actual use case there may be more than two rows, depending on the facet variable used; the y-axis title should be centered on each facet row.

The best solution so far is adding spaces, which is a mess because y-axis titles of different lengths shift the text away from the middle of the rows. It must be done with ggplot2 only, i.e. without additional packages. I am writing a package and do not want to depend on or include too many packages.

Data used here:

df <- data.frame(x= rnorm(100), y= rnorm(100),
                 group1= rep(0:1, 50), group2= rep(2:3, each= 50))

4 Answers

3

Without using another package, I felt that the best method would be to build upon the spaces solution you linked in the original question. So I wrote a function to make the label spacing a little bit more robust.

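# Pad two row labels with spaces so that each lands near the middle of its row
ylabel <- function(label1, label2) {
  L1 <- nchar(label1)
  L2 <- nchar(label2)
  # extra correction applied when the two labels together are long
  scaler <- ifelse(L1 + L2 > 8, 4, 0)
  # joining n empty strings with " " gives n - 1 spaces
  space1 <- paste0(rep("", 27 - (L1/2)), collapse = " ")
  space2 <- paste0(rep("", 44 - (L1/2 + L2/2) - scaler), collapse = " ")
  space3 <- paste0(rep("", 22 - (L2/2)), collapse = " ")
  paste0(space1, label1, space2, label2, space3)
}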

Application:

test <- ylabel("automobiles", "trucks")
ggplot(df, aes(x, y)) +
  geom_point() +
  facet_grid(group1 ~ group2) +
  ylab(test)

[Image: plot1]

Still playing around with the scaler parameter, it's not perfect:

test2 <- ylabel("super long label", "a")
ggplot(df, aes(x, y)) +
  geom_point() +
  facet_grid(group1 ~ group2) +
  ylab(test2)

[Image: plot2]

Will continue to refine the function/parameters, but am thinking this will get you close to what you're looking for.

NovaEthos
  • I like that, but in its current form, it’s not scalable - would not work with more than two rows. Also, you should try hard not to name any object after base R functions, and ‘c’ is probably the worst choice of all – tjebo Nov 23 '21 at 12:04
  • @tjebo your points are valid. I have adjusted my object names accordingly. Also, I admit that I overlooked the OP's requirement to have a flexible number of rows. That adds another layer onto this tricky problem. – NovaEthos Nov 23 '21 at 13:16
3

You can copy the axis labels into new grobs in the gtable. Note that although this uses the grid and gtable packages, these are already imported by ggplot2, so this does not add any new dependencies that are not already available and used internally by ggplot.

library(grid)
library(gtable)

g = ggplot(df, aes(x, y)) +
  geom_point() +
  facet_grid(group1 ~ group2)

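gt = ggplot_gtable(ggplot_build(g))          # convert the plot to a gtable
which.ylab = grep('ylab-l', gt$layout$name)  # index of the left y-axis title grob
# copy the title into layout rows 8 and 10, column 3 (the two facet rows in this layout)
gt = gtable_add_grob(gt, gt$grobs[which.ylab], 8, 3)
gt = gtable_add_grob(gt, gt$grobs[which.ylab], 10, 3)
gt = gtable_filter(gt, 'ylab-l', invert = TRUE) # remove the original axis title
grid.draw(gt)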

The above works for OP's example with just two facet rows. If we want to generalise this to an arbitrary number of facet rows, we can do that simply enough by searching the gtable to see which rows contain y-axes.

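gt = ggplot_gtable(ggplot_build(g))
which.ylab = grep('ylab-l', gt$layout$name)  # the left y-axis title
which.axes = grep('axis-l', gt$layout$name)  # the left axes, one per facet row
axis.rows  = gt$layout$t[which.axes]         # layout rows holding those axes
label.col  = gt$layout$l[which.ylab]         # layout column of the title
# add one copy of the title per facet row
gt = gtable::gtable_add_grob(gt, rep(gt$grobs[which.ylab], length(axis.rows)), axis.rows, label.col)
gt = gtable::gtable_filter(gt, 'ylab-l', invert = TRUE) # remove the original axis title
grid::grid.draw(gt)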

In the version above, I also use :: to explicitly specify the namespace for the functions from the grid and gtable packages. This will allow the code to work without even loading the additional packages into the search path.

Demonstrating this code with another example with four facet rows:

df <- data.frame(x= rnorm(100), y= rnorm(100),
                 group1= rep(1:4, 25), group2= rep(1:2, each= 50))

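For use inside a package, the generalised steps above could be wrapped in a single function. This is a minimal sketch, not a definitive implementation: `repeat_ylab()` and the `row-title-` grob names are hypothetical, and it assumes the gtable layout names `ylab-l` and `axis-l-*` seen in the code above, which may differ between ggplot2 versions.

```r
library(ggplot2)

# Hypothetical helper (not part of ggplot2): copy the left y-axis title
# next to every facet row, then drop the original title.
repeat_ylab <- function(p) {
  gt <- ggplot2::ggplotGrob(p)
  which_ylab <- grep("ylab-l", gt$layout$name)
  which_axes <- grep("axis-l", gt$layout$name)
  axis_rows  <- gt$layout$t[which_axes]   # layout rows that hold a left axis
  label_col  <- gt$layout$l[which_ylab]   # layout column of the original title
  gt <- gtable::gtable_add_grob(
    gt,
    grobs = rep(gt$grobs[which_ylab], length(axis_rows)),
    t = axis_rows, l = label_col,
    name = paste0("row-title-", seq_along(axis_rows))  # assumed naming scheme
  )
  # remove the original title that was centred on the whole plot
  gtable::gtable_filter(gt, "ylab-l", invert = TRUE)
}

# Usage with four facet rows
df_rows <- data.frame(x = rnorm(100), y = rnorm(100),
                      group1 = rep(1:4, 25), group2 = rep(1:2, each = 50))
gt_rows <- repeat_ylab(ggplot(df_rows, aes(x, y)) +
                         geom_point() +
                         facet_grid(group1 ~ group2))
grid::grid.newpage()
grid::grid.draw(gt_rows)
```

Because the function calls `gtable::` and `grid::` directly, a package using it would list those packages under Imports in its DESCRIPTION, as discussed in the comments below.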

dww
  • Really like that solution. Regarding dependencies though -(I personally would not mind adding a few dependencies more)- I think even if ggplot2 imports the entire namespace of those packages, you would still need to import directly from those packages if you make explicit use of those functions, thus also adding it to your dependencies in the description. – tjebo Nov 23 '21 at 21:43
  • Yes, you still need to import them but I don't understand what downside that can have, considering they must already be installed. – dww Nov 23 '21 at 23:13
  • "These are already imported by ggplot2" Does this mean that anyone who has installed `ggplot2` can run the code after `library(grid); library(gtable)`? –  Nov 24 '21 at 12:30
  • Yes it does. Although if you want to use the code inside a package, you would do it slightly differently. In packages, rather than use the `library` function, you instead list the packages to import in a file called DESCRIPTION (see for example [here](https://r-pkgs.org/namespace.html#imports)). If you are using ggplot2 in your package, you should already be doing that to access the `ggplot` function. – dww Nov 24 '21 at 12:53
  • Oh no, I gave the bounty by mistake to the other answer because it was shown at the top. I expected the accepted answer to be at the top. Can't undo.. –  Nov 30 '21 at 12:40
  • No worries. I expect you made @susanswitzer very happy. – dww Nov 30 '21 at 13:39
2

You could consider using library(cowplot) for more control.

The following code could be wrapped in a function, but I left it long for clarity. Create four data frames, build one plot from each, then arrange the plots.

library(tidyverse)
df <- data.frame(x= rnorm(100), y= rnorm(100),
                 group1= rep(0:1, 50), group2= rep(2:3, each= 50))


library(cowplot)
df1 <- df %>% 
  filter(group2 == 2) %>% 
  filter(group1 == 0)

df2 <- df %>% 
  filter(group2 == 3) %>% 
  filter(group1 == 0)

df3 <- df %>% 
  filter(group2 == 2) %>% 
  filter(group1 == 1)

df4 <- df %>% 
  filter(group2 == 3) %>% 
  filter(group1 == 1)

plot1 <- ggplot(df1, aes(x, y)) +
  geom_point() +
  facet_grid(group1 ~ group2)+
  xlim(c(-3, 3))+
  ylim(c(-3, 2))+
  theme(strip.text.y = element_blank(), 
        axis.title.x = element_blank(), 
        axis.text.x = element_blank(), 
        axis.ticks.x = element_blank()
        )
plot1


plot2 <- ggplot(df2, aes(x, y)) +
  geom_point() +
  facet_grid(group1 ~ group2)+
  xlim(c(-3, 3))+
  ylim(c(-3, 2))+
  theme(axis.title.y = element_blank(), 
        axis.text.y = element_blank(), 
        axis.ticks.y = element_blank(), 
        axis.title.x = element_blank(), 
        axis.text.x = element_blank(), 
        axis.ticks.x = element_blank()
        )
plot2


plot3 <- ggplot(df3, aes(x, y)) +
  geom_point() +
  facet_grid(group1 ~ group2)+
  xlim(c(-3, 3))+
  ylim(c(-3, 2))+
  theme(strip.text.x = element_blank(),
        strip.text.y = element_blank())
plot3


plot4 <- ggplot(df4, aes(x, y)) +
  geom_point() +
  facet_grid(group1 ~ group2)+
  xlim(c(-3, 3))+
  ylim(c(-3, 2))+
  theme(axis.title.y = element_blank(), 
        strip.text.x = element_blank(),
        axis.text.y = element_blank(), 
        axis.ticks.y = element_blank())
plot4

plot_grid(plot1, plot2, plot3, plot4)

[Image: plotgrid]
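As a rough sketch of how the long version above might be folded into a function, the following loops over every group1/group2 combination instead of writing each plot by hand. `grid_of_plots()` is a hypothetical name; the sketch assumes columns called x, y, group1 and group2, assumes every combination contains data, and uses the data range for the axis limits instead of the fixed limits above.

```r
library(ggplot2)
library(cowplot)

# Hypothetical helper: one single-panel plot per group1/group2 combination,
# arranged with cowplot::plot_grid().
grid_of_plots <- function(df) {
  rows <- sort(unique(df$group1))
  cols <- sort(unique(df$group2))
  # expand.grid varies its first argument fastest, so plots are ordered row by row
  cells <- expand.grid(col = cols, row = rows)
  plots <- lapply(seq_len(nrow(cells)), function(i) {
    r <- cells$row[i]
    k <- cells$col[i]
    p <- ggplot(df[df$group1 == r & df$group2 == k, ], aes(x, y)) +
      geom_point() +
      facet_grid(group1 ~ group2) +
      xlim(range(df$x)) +   # shared limits (assumption: data range)
      ylim(range(df$y))
    # column strips only on the top row
    if (r != rows[1]) p <- p + theme(strip.text.x = element_blank())
    # x axis only on the bottom row
    if (r != rows[length(rows)]) {
      p <- p + theme(axis.title.x = element_blank(),
                     axis.text.x = element_blank(),
                     axis.ticks.x = element_blank())
    }
    # y axis and its title only on the left column
    if (k != cols[1]) {
      p <- p + theme(axis.title.y = element_blank(),
                     axis.text.y = element_blank(),
                     axis.ticks.y = element_blank())
    }
    # row strips only on the right column
    if (k != cols[length(cols)]) p <- p + theme(strip.text.y = element_blank())
    p
  })
  plot_grid(plotlist = plots, ncol = length(cols))
}

df_demo <- data.frame(x = rnorm(100), y = rnorm(100),
                      group1 = rep(0:1, 50), group2 = rep(2:3, each = 50))
combined <- grid_of_plots(df_demo)
combined
```

Each left-column plot keeps its own y-axis title, so the title ends up centered on every row regardless of how many rows there are.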

Susan Switzer
2

Here is a version with annotation, using ggplot2 only. It should be scalable.

No messing with grobs. The disadvantage is that the x positioning and the plot margins need to be semi-manually defined and this might not be very robust.

library(ggplot2)

df <- data.frame(x= rnorm(100), y= rnorm(100),
                 group1= rep(0:1, 50), group2= rep(2:3, each= 50))

## define a new data frame based on your groups, so this is scalable
annotate_ylab <- function(df, x, y, group1, group2, label = "label") {
  ## make group2 a factor, so you know which column will be to the left
  df[[group2]] <- factor(df[[group2]])
  lab_df <- data.frame( 
    ## x positioning is a bit tricky,
    ## I think a moderately robust method is to
    ## set it relative to the range of your values
    x = min(df[[x]]) - 0.2 * diff(range(df[[x]])),
    y = mean(df[[y]]),
    g1 = unique(df[[group1]]),
    ## draw only on the left column
    g2 = levels(df[[group2]])[1],
    label = label
  )
  names(lab_df) <- c(x, y, group1, group2, "label")
  lab_df
}

y_df <- annotate_ylab(df, "x", "y", "group1", "group2", "y")

ggplot(df, aes(x, y)) +
  geom_point() +
  geom_text(data = y_df, aes(x, y, label = label), angle = 90) +
  facet_grid(group1 ~ group2) +
  coord_cartesian(xlim = range(df$x), clip = "off") +
  theme(axis.title.y = element_blank(), 
        plot.margin = margin(5, 5, 5, 20))

y_df_mtcars <- annotate_ylab(mtcars, "mpg", "disp", "carb", "vs", "y")

ggplot(mtcars, aes(mpg, disp)) +
  geom_point() +
  geom_text(data = y_df_mtcars, aes(mpg, disp, label = label), angle = 90) +
  facet_grid(carb ~ vs) +
  coord_cartesian(xlim = range(mtcars$mpg), clip = "off") +
  theme(axis.title.y = element_blank(), 
        plot.margin = margin(5, 5, 5, 20))

Created on 2021-11-24 by the reprex package (v2.0.1)

marc_s
tjebo
  • The idea is great (+1). Unfortunately, as you mention, it is not robust. I tried `df <- mtcars; df$x <- df$mpg; df$y <- df$disp; df$group1 <- as.factor(df$carb); df$group2 <- as.factor(df$vs)` as new data and in the resulting plot the y label gets eaten by the y values. –  Nov 24 '21 at 12:28
  • @machine I found a few minutes :) I think it should be much more robust when positioning relative to the range of your values - see my update. – tjebo Nov 24 '21 at 18:02
  • I think this is almost done. What I noticed is that the scale also must be adjusted to the number of column facets. Using a variable with more levels than `vs` as `group2` again shifts the y labels in the y values. Try it with `y_df_mtcars <- annotate_ylab(mtcars, "mpg", "disp", "carb", "cyl", "y") ... facet_grid(carb ~ cyl) + ...`, for example. I changed `x` to `min(df[[x]]) - 0.25 * diff(range(df[[x]])) * (length(levels(df[[group2]]))*.4)` which seems to work here. Not tried other data yet. –  Nov 24 '21 at 21:54
  • I guess the solution is flawed in exactly that way that there will always be a case where it will fail to work… Guess it depends on what you expect of how your users are most likely to use it. Can be tricky to think of all use cases. I think dww’s solution is most robust, even if the internal grob structure has changed in the past, and there is no reason to think that won’t happen in the future, even so I guess it won’t need much adjustment to update your package accordingly – tjebo Nov 25 '21 at 13:39
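The adjustment suggested in the comments above (scaling the x offset by the number of column facets) could be folded into the helper as extra arguments. This is only a sketch: `annotate_ylab_scaled()`, `offset` and `col_scale` are hypothetical names, and the defaults 0.25 and 0.4 are simply the values quoted in that comment, which was reported to work for `facet_grid(carb ~ cyl)` but not checked on other data.

```r
# Hypothetical variant of annotate_ylab(): the x offset grows with the
# number of column facets, as proposed in the comment thread.
annotate_ylab_scaled <- function(df, x, y, group1, group2, label = "label",
                                 offset = 0.25, col_scale = 0.4) {
  df[[group2]] <- factor(df[[group2]])
  n_cols <- nlevels(df[[group2]])  # number of column facets
  lab_df <- data.frame(
    # shift left of the data by a fraction of the x range, scaled by n_cols
    x = min(df[[x]]) - offset * diff(range(df[[x]])) * n_cols * col_scale,
    y = mean(df[[y]]),
    g1 = unique(df[[group1]]),
    # draw only on the left column
    g2 = levels(df[[group2]])[1],
    label = label
  )
  names(lab_df) <- c(x, y, group1, group2, "label")
  lab_df
}

y_df_cyl <- annotate_ylab_scaled(mtcars, "mpg", "disp", "carb", "cyl", "y")
y_df_cyl
```

The result can be passed to `geom_text(data = y_df_cyl, ...)` in the same way as `y_df_mtcars` in the answer, with `facet_grid(carb ~ cyl)`. The plot margin would probably still need manual tuning.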