For this illustrative data:

set.seed(123)
df <- data.frame(
  ID = 1:50,
  Q = rnorm(50),
  A = rnorm(50,1)
)

I want to connect the paired points across ID, which is possible with the help of the answer to "Connect jittered points by group":

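library(tidyr)
library(dplyr, warn.conflicts = FALSE)
library(ggplot2)

# shared jitter object: the same seed gives points and lines identical offsets
pj <- position_jitter(seed = 1, width = .1, height = 0)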

df %>%
  pivot_longer(-ID) %>%
  ggplot(aes(x = factor(name), y = value, fill = factor(name))) +

  # boxplot:
  geom_boxplot(
    width = 0.12,
    outlier.color = NA,
    alpha = 0.5
  ) +
  
  # data points:
  geom_point(
    alpha = 0.5, col = "blue",
    position = pj
  ) +
  
  # connecting lines:
  geom_path(aes(group = ID),
            alpha = 0.5,
            position = pj
  )

What bothers me is that the points overplot the boxplots. I would like them to be separated from the boxplots. Specifically, they should be moved into the space between the boxes, like in this plot:

[Image: target plot, with the points and connecting lines placed between the two boxplots]

How can this be achieved? Many thanks in advance.

Chris Ruehlemann
  • Can you make pseudo-factor levels for the boxplots alongside "A" and "Q", so that the x-axis factor is (for example) `c("A_bp", "A", "Q", "Q_bp")`? Edit: `"_bp"` marks the levels used only for the boxplots, whereas `"A"` and `"Q"` are the standard values. To adjust the x-axis to suit, you should then be able to change `breaks` within `scale_x_discrete`? – JudgedGem Apr 04 '23 at 09:39

1 Answer

One option is to shift (nudge) the positions of the points and the lines. This requires converting name to a numeric and accounting for both the boxplot width and the jitter width:

library(tidyr)
library(dplyr, warn.conflicts = FALSE)
library(ggplot2)

box_width <- .12
jitter_width <- .1

pj <- position_jitter(seed = 1, width = jitter_width, height = 0)

df %>%
  pivot_longer(-ID) %>%
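  mutate(
    # numeric x position of each category (default levels: A = 1, Q = 2)
    name_num = as.numeric(factor(name)),
    # shift toward the gap between the boxes: A to the right, Q to the left
    name_num = name_num + (box_width + jitter_width / 2) * if_else(name == "A", 1, -1)
  ) %>%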
  ggplot(aes(x = factor(name), y = value, fill = factor(name))) +
  geom_boxplot(
    width = box_width,
    outlier.color = NA,
    alpha = 0.5
  ) +
  geom_point(
    aes(x = name_num),
    alpha = 0.5, col = "blue",
    position = pj
  ) +
  geom_path(aes(x = name_num, group = ID),
    alpha = 0.5,
    position = pj
  )
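
To see where the shifted points land, the offset arithmetic from the mutate() call above can be checked on its own. This is only a sketch: the variable names `centers`, `shifted`, `box_edge_A` and `leftmost_point_A` are made up for illustration, and it assumes that position_jitter(width = w) moves a point by at most w in either direction.

```r
box_width <- 0.12
jitter_width <- 0.1

# Same offset as in the mutate() call above
offset <- box_width + jitter_width / 2

# With default alphabetical levels, "A" sits at x = 1 and "Q" at x = 2
centers <- c(A = 1, Q = 2)

# A is pushed right, Q is pushed left, so both move into the gap
shifted <- centers + offset * c(A = 1, Q = -1)
shifted  # A = 1.17, Q = 1.83

# Right edge of box A versus the leftmost position a jittered A point can take
# (assumption: jitter of at most jitter_width to either side)
box_edge_A <- centers[["A"]] + box_width / 2
leftmost_point_A <- shifted[["A"]] - jitter_width
leftmost_point_A > box_edge_A  # TRUE: the points clear the box
```

Under that assumption, the chosen offset leaves a small gap between the box and the nearest possible point.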

EDIT: To switch the positions of the name categories, set the levels in the desired order when converting to a factor. The direction of the shift in if_else() has to be flipped accordingly, so that Q (now on the left) moves right and A moves left.

library(tidyr)
library(dplyr, warn.conflicts = FALSE)
library(ggplot2)

box_width <- .12
jitter_width <- .1

pj <- position_jitter(seed = 1, width = jitter_width, height = 0)

df %>%
  pivot_longer(-ID) %>%
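  mutate(
    # explicit level order: Q = 1 (left), A = 2 (right)
    name = factor(name, levels = c("Q", "A")),
    name_num = as.numeric(name),
    # shift toward the gap between the boxes: Q to the right, A to the left
    name_num = name_num + (box_width + jitter_width / 2) * if_else(name == "Q", 1, -1)
  ) %>%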
  ggplot(aes(x = name, y = value, fill = name)) +
  geom_boxplot(
    width = box_width,
    outlier.color = NA,
    alpha = 0.5
  ) +
  geom_point(
    aes(x = name_num),
    alpha = 0.5, col = "blue",
    position = pj
  ) +
  geom_path(aes(x = name_num, group = ID),
    alpha = 0.5,
    position = pj
  )
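
The reordering works because as.numeric() on a factor returns the position of each value within the level order. A minimal illustration, using a made-up vector `name`:

```r
name <- c("Q", "A", "Q", "A")

# Default: levels are sorted alphabetically, so "A" maps to 1 and "Q" to 2
default_pos <- as.numeric(factor(name))
default_pos  # 2 1 2 1

# Explicit levels: "Q" now maps to 1 (left) and "A" to 2 (right)
custom_pos <- as.numeric(factor(name, levels = c("Q", "A")))
custom_pos  # 1 2 1 2
```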

stefan
  • Awesome, but is it really that complicated? – Chris Ruehlemann Apr 04 '23 at 09:49
  • Last question: how can I change the order of `A` and `Q` on the x-axis so that the `Q` data are plotted first, i.e., on the left, and the `A` data second, i.e., on the right? – Chris Ruehlemann Apr 04 '23 at 09:52
  • Haha. Would be interested to see if there is an easier option using just plain ggplot2. Perhaps there is a package which offers an out-of-the-box solution. For your second question: See my edit. – stefan Apr 04 '23 at 10:01
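
On the question of a plain-ggplot2 variant: one possible sketch (not part of the answer above) is to compute the jittered x positions in the data, so that points and lines share the same coordinates without needing a common jitter seed. The column names `side` and `x_pos` and the use of runif() for the jitter are assumptions made for illustration.

```r
library(tidyr)
library(dplyr, warn.conflicts = FALSE)
library(ggplot2)

set.seed(123)
df <- data.frame(ID = 1:50, Q = rnorm(50), A = rnorm(50, 1))

box_width <- 0.12
jitter_width <- 0.1

long <- df %>%
  pivot_longer(-ID) %>%
  mutate(
    name = factor(name, levels = c("Q", "A")),
    # +1 pushes Q to the right, -1 pushes A to the left
    side = if_else(name == "Q", 1, -1),
    # category position + shift into the gap + hand-made uniform jitter
    x_pos = as.numeric(name) + side * (box_width + jitter_width / 2) +
      runif(n(), -jitter_width, jitter_width)
  )

p <- ggplot(long, aes(x = name, y = value)) +
  geom_boxplot(aes(fill = name), width = box_width,
               outlier.color = NA, alpha = 0.5) +
  # points and lines read the same precomputed x_pos, so they stay aligned
  geom_point(aes(x = x_pos), alpha = 0.5, col = "blue") +
  geom_line(aes(x = x_pos, group = ID), alpha = 0.5)
p
```

The trade-off is that the jitter lives in the data rather than in a position adjustment, which makes the plotting code shorter but the preprocessing slightly longer.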