
I'm learning how to use geom_line with facet_nested, but I cannot produce a continuous line across facets: there is a gap between each pair of adjacent facets. I want a clean, continuous line, as I would get if I produced the same chart in Microsoft Excel. Before asking here, I searched and found the following Stack Overflow thread:

How can I draw geom_line across facets or grid

Despite appearances, this is not exactly what I'm looking for, because the author doesn't group the data the way facet_nested does. To show the problem more clearly, I provide a test case below that you can copy and paste into RStudio (Windows environment).

Here is the dataframe that I use for my graphic:

df_graph_data = data.frame(
    year = c(
        rep.int("2020", times = 11), 
        rep.int("2021", times = 12), 
        rep.int("2022", times = 3)
    ),
    month_name = c(
        "February", "March", "April", "May", "June", "July",
        "August", "September", "October", "November", "December",
        "January", "February", "March", "April", "May", "June", "July",
        "August", "September", "October", "November", "December",
        "January", "February", "March"
    ),
    month_number = c(
        "02", "03", "04", "05", "06", "07",
        "08", "09", "10", "11", "12", "01",
        "02", "03", "04", "05", "06", "07",
        "08", "09", "10", "11", "12", "01",
        "02", "03"
    ),
    number_of_queries = c(
        484819L, 576697L, 843015L, 925175L,
        1102853L, 889212L, 835706L, 774622L,
        701338L, 850297L, 1046064L, 1273363L,
        958868L, 1088284L, 1151606L, 1666950L,
        2025731L, 2731704L, 2429019L, 3228395L,
        3204915L, 2612807L, 2811946L, 3053788L,
        2589273L, 2305433L
    )
)
###
###
### I add the following variable in order to be able to identify and reference
### each observation in my data.frame uniquely
df_graph_data$rownum = 1:nrow(df_graph_data)

And here is the graphic that I produced using the above data frame:

library("tidyverse")
library("ggh4x")
options(device = "windows")
###
###
require(scales)
ggplot(data = df_graph_data) +
    geom_line(mapping = aes(
        x = rownum,
        y = number_of_queries
        ),
        size = 1,
        colour = "blue",
        linetype = "solid"
    ) + 
    scale_x_continuous(
        breaks = seq(
            min(df_graph_data$rownum),
            max(df_graph_data$rownum),
            by = 1L
        ),
        labels = df_graph_data$month_number
    ) +
    scale_y_continuous(
        limits = c(0, max(df_graph_data$number_of_queries) + 1000000L),
        breaks = seq(0, max(df_graph_data$number_of_queries) + 1000000L,
            by = 250000L
        ),
        labels = comma,
        expand = expansion(mult = 0, add = 0)
    ) +
    geom_point(mapping = aes(
        x = rownum,
        y = number_of_queries),
        shape = 15
    ) +
    facet_nested(
        ~ year,
        switch = "x",
        scales = "free_x",
        space = "free_x"
    ) +
    labs(
        x = "Month",
        y = "Number of clients queries",
        caption = "Statistics on queries"
    ) +
    theme(
        strip.placement = "outside",
        axis.text.x = element_text(angle = 0, vjust = 0.5, hjust=0.5),
        axis.title.y = element_text(margin = margin(t = 0, r = 15, b = 0, l = 0)),
        axis.title.x = element_text(margin = margin(t = 15, r = 0, b = 0, l = 0)),
        plot.title = element_text(hjust = 0.5, size = 12),
        plot.caption = element_text(size = 8, hjust = 1),
        plot.caption.position = "plot",
        panel.spacing.x=unit(0.0, "lines"),
        panel.spacing.y=unit(0.0, "lines")
    )

I attach the following screenshot, which shows my problem in more detail:

enter image description here

Is there any solution in ggplot2 for the two problems indicated in the above screenshot, in particular for drawing a continuous line across facets?

Gregor Thomas
user17911

1 Answer


It's possible to add annotations that "connect" multiple facets using grid or patchwork, but I think a simpler approach would be to avoid facets altogether and add the year boxes as a separate plot with the same x range.

I think this could also be done using annotations outside the plot range (e.g. with coord_cartesian(clip = "off")), but the method below was quicker for me to get working.

enter image description here

library(patchwork); library(ggplot2); library(scales)

main_plot <- ggplot(data = df_graph_data) +
  geom_line(mapping = aes(
    x = rownum,
    y = number_of_queries
  ),
  size = 1,
  colour = "blue",
  linetype = "solid"
  ) + 
  
  scale_x_continuous(
    breaks = seq(
      min(df_graph_data$rownum),
      max(df_graph_data$rownum),
      by = 1L
    ),   # let's define x range in coord_cartesian instead 
    labels = df_graph_data$month_number,
    name = NULL
  ) +
  scale_y_continuous(
    limits = c(0, max(df_graph_data$number_of_queries) + 1000000L),
    breaks = seq(0, max(df_graph_data$number_of_queries) + 1000000L,
                 by = 250000L
    ),
    labels = comma,
    expand = expansion(mult = 0, add = 0)
  ) +
  geom_point(mapping = aes(
    x = rownum,
    y = number_of_queries),
    shape = 15
  ) +
  coord_cartesian(xlim = c(0,27), expand = FALSE) + # range should align between plots
  labs(
    x = "Month",
    y = "Number of clients queries"
  ) +
  theme(
    strip.placement = "outside",
    axis.text.x = element_text(angle = 0, vjust = 0.5, hjust=0.5),
    axis.title.y = element_text(margin = margin(t = 0, r = 15, b = 0, l = 0)),
    axis.title.x = element_text(margin = margin(t = 15, r = 0, b = 0, l = 0)),
    plot.title = element_text(hjust = 0.5, size = 12),
    plot.caption = element_text(size = 8, hjust = 1),
    plot.caption.position = "plot"
  ) 


# data for year labels
year_lab <- data.frame(
  from  = c(0,  11.5, 23.5),
  to    = c(11.5, 23.5, 27),
  lab   = 2020:2022,
  y_top = 2E5,  
  y_bot = 0E5
)

year_plot <- ggplot(year_lab) +
  geom_rect(fill = "gray80", color = "gray70",
            aes(xmin = from, xmax = to, ymin = y_bot, ymax = y_top)) +
  geom_text(aes((from+to)/2, y = (y_bot+y_top)/2, label = lab),
            vjust = 0.5, hjust = 0.5) +
  theme_void() +
  coord_cartesian(clip = "off", xlim = c(0,27), expand = FALSE) +
  labs(caption = "Statistics on queries") +
  theme(plot.margin = unit(c(0,0,0,0), "lines"))

main_plot / year_plot +
  plot_layout(ncol = 1, heights = c(10,0.5))
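
For comparison, the annotation-based alternative mentioned above (coord_cartesian(clip = "off")) might look roughly like the sketch below. It is untested against the original data: `demo` is hypothetical stand-in data with 26 consecutive months, and the label offset (`y = -3e5`) and bottom margin are guesses that would likely need tuning to the output size.

```r
library(ggplot2)

# Hypothetical stand-in for df_graph_data: 26 consecutive months
demo <- data.frame(
  rownum = 1:26,
  value  = seq(5e5, 3e6, length.out = 26)
)

# Same year boundaries as in the answer above
year_lab <- data.frame(
  from = c(0, 11.5, 23.5),
  to   = c(11.5, 23.5, 27),
  lab  = 2020:2022
)

clip_plot <- ggplot(demo, aes(rownum, value)) +
  geom_line(colour = "blue") +
  geom_point(shape = 15) +
  # Year labels placed below y = 0, i.e. outside the panel area
  geom_text(
    data = year_lab,
    aes(x = (from + to) / 2, y = -3e5, label = lab),
    inherit.aes = FALSE
  ) +
  scale_x_continuous(breaks = 1:26, name = NULL) +
  # clip = "off" lets the labels be drawn outside the panel
  coord_cartesian(xlim = c(0, 27), ylim = c(0, 4e6),
                  expand = FALSE, clip = "off") +
  # Extra bottom margin so the labels are not cut off by the device edge
  theme(plot.margin = margin(t = 5, r = 5, b = 30, l = 5))
```

Because the labels are positioned in data units, their distance from the axis changes with the plot height, which is one reason the two-plot patchwork layout may be easier to control.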
Jon Spring
  • Thank you very much for your help. I'm reading the documentation as there are several functions in your suggested solution that I hadn't seen before. I have a few questions about your solution : 1) What does the "/" character do in main_plot / year_plot? 2) What does "lines" mean in theme(plot.margin = unit(c(0,0,0,0), "lines")) ? 3) In the data.frame year_lab I can understand the meaning of the values you have chosen for 'from' and 'to' but I don't understand the values of 'y_top' and 'y_bot'. What is the reference according to which you have picked these values? – user17911 Apr 19 '22 at 16:01
  • `/` is from `patchwork`, and it means "put the next plot under the current one." The `unit` function can use a variety of measurement units, like cm, inches, npc, lines. The help for `?unit` defines these. The `y_top` and `y_bot` values are arbitrary remnants of an earlier approach. As long as they are different from each other I think the plot will turn out the same, since its size is driven by the `heights` argument of `plot_layout`. – Jon Spring Apr 19 '22 at 16:32
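
As a small illustration of the two points in that comment, the sketch below stacks two throwaway plots with `/` and builds the same layout with `wrap_plots()`, then creates the zero margin with `unit()`. The plots `p1` and `p2` are hypothetical examples built from the built-in `mtcars` data, not the plots from the answer.

```r
library(ggplot2)
library(patchwork)
library(grid)

# Two hypothetical example plots
p1 <- ggplot(mtcars, aes(wt, mpg)) + geom_point()
p2 <- ggplot(mtcars, aes(wt, hp)) + geom_point()

# `/` stacks p2 under p1; plot_layout() then sets the relative heights
stacked <- p1 / p2 + plot_layout(heights = c(10, 0.5))

# The same arrangement written with an explicit function call
same_layout <- wrap_plots(p1, p2, ncol = 1, heights = c(10, 0.5))

# "lines" is one of the units accepted by unit(); it measures in multiples
# of the text line height, so this is a zero margin on all four sides
zero_margin <- unit(c(0, 0, 0, 0), "lines")
```

Printing `stacked` or `same_layout` should draw the two plots one above the other, with the second taking up only a thin strip.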