
I am working on visual representations of single case studies. I need to make some changes to my graph in ggplot2, but I found this to be a bit challenging. Here is a brief description of the variables contained in the toy data set that I used to make a reproducible example:

  • Occasion: number of the session in which the rater evaluated the behavior (from 1 to n);
  • Time: session number within each condition (baseline from 1 to n and intervention from 1 to n);
  • Phase: condition (A = baseline or B = intervention);
  • ID: student code in the study;
  • outcome: total score on a behavioral checklist.

library(dplyr)
library(ggplot2)

db_tb <- read.table(header = TRUE, text = '
Occasion Time Phase ID outcome
      1    1     A  1       15
      2    2     A  1       14
      3    3     A  1        8
      4    4     A  1       10
      5    5     A  1       14
      6    6     A  1        8
      7    7     A  1       10
      8    1     B  1       21
      9    2     B  1       23
     10    3     B  1       24
     11    4     B  1       20
     12    5     B  1       25
     13    6     B  1       15
     14    7     B  1       11
     15    8     B  1       23
     16    9     B  1       20
     17   10     B  1       NA
     18   11     B  1       15
     19   12     B  1       20
     20   13     B  1       NA
     21   14     B  1       16
      1    1     A  2       18
      2    2     A  2       14
      3    3     A  2       18
      4    4     A  2       21
      5    1     B  2        8
      6    2     B  2       NA
      7    3     B  2       10
      8    4     B  2       17
      9    5     B  2       NA
     10    6     B  2       29
      1    1     A  3       15
      2    2     A  3        7
      3    3     A  3       14
      4    1     B  3       15
      5    2     B  3       14
      6    3     B  3       11
      7    4     B  3       10
      8    5     B  3       NA
      9    6     B  3       NA
      10   7     B  3        7
      11   8     B  3        9
      12   9     B  3       13
      13  10     B  3       11
')

In order to separate the baseline and intervention, I created vlines_tb that gives me a table with the session number after which I will set the vertical line in ggplot2.

#table containing the last day of the baseline phase
vlines_tb <- db_tb %>% 
  filter(Phase == "A") %>% 
  group_by(ID, Phase) %>%
  summarise(y = max(Occasion))

Finally, I created the graph in a style consistent with other papers in the field.

#create a visual representation
db_tb %>% 
  filter(!is.na(outcome)) %>%  # drop sessions with a missing score so all markers within a phase are connected
  ggplot(aes(x = Occasion, y = outcome, group = Phase)) + 
  geom_point(size = 1.8) + 
  geom_line(size = 0.65) +
  ggtitle("Baseline") +
  facet_grid(ID ~ .) +
  scale_x_continuous(name = "Occasions", breaks = seq(0, 70, 5)) +
  scale_y_continuous(name = "Rating", limits = c(0, 30)) +
  theme_classic() +
  theme(strip.background = element_blank(),
        axis.title.x = element_text(margin = margin(t = 20, r = 0, b = 0, l = 0)),
        axis.title.y = element_text(margin = margin(t = 0, r = 20, b = 0, l = 0))) +
  annotate("segment", x = -Inf, xend = Inf, y = -Inf, yend = -Inf) +
  geom_vline(data = vlines_tb, aes(xintercept = y + 0.5), colour = "black", linetype = "dashed")

[Image: current graph produced by the code above]

However, I would like to make a couple of changes:

  • Sort the subjects by when the intervention was implemented, so that the subject at the top is the one who received the intervention first, and so on;
  • Re-label the IDs so that the first subject from the top of the graph is renamed 1, and so on;
  • Label the baseline and intervention conditions at the top of the chart (each label should sit above its corresponding area).

I made the changes in Excel to show what the final outcome should look like (see below). Thanks for any help!

[Image: desired final graph, mocked up in Excel]

Michael Matta

1 Answer

See if this works for you. Explanations are in the code comments:

db_tb %>%
  na.omit(outcome) %>%

  # calculate the intervention timing for each ID
  group_by(ID) %>%
  mutate(intervention.timing = max(Occasion[Phase == "A"])) %>%
  ungroup() %>% 

  # convert IDs to factors ordered according to their intervention timing
  mutate(ID = forcats::fct_reorder(factor(ID), intervention.timing)) %>%

  # create new IDs based on order of original ID's levels
  mutate(new.ID = as.integer(ID)) %>%

  # define label position for each ID
  group_by(ID, Phase) %>%
  mutate(label.x = mean(Occasion)) %>%
  ungroup() %>%
  mutate(label = ifelse(Phase == "A", "Baseline", "Intervention"), 
         label.y = max(outcome)) %>%

  ggplot(aes(x = Occasion, y = outcome)) +
  geom_point(size = 1.8) + 
  geom_line(aes(group = Phase), 
            size = 0.65) +
  geom_vline(data = . %>% select(new.ID, intervention.timing) %>% unique(),
             aes(xintercept = intervention.timing + 0.5),
             linetype = "dashed") +
  # only show phase label in top facet
  geom_text(data = . %>% select(new.ID, Phase, label.x, label.y, label) %>% unique(),
            aes(x = label.x, y = label.y, label = label,
                alpha = ifelse(new.ID == min(new.ID), 1, 0)),
            vjust = 1, fontface = "bold") +
  annotate("segment", x = -Inf, xend = Inf, y = -Inf, yend = -Inf) +

  facet_grid(new.ID ~ .) +
  scale_x_continuous(name = "Occasions", breaks = seq(0, 70, 5)) +
  scale_y_continuous(name = "Rating", limits = c(0, 30)) +
  scale_alpha_identity() +
  theme_classic() +
  theme(strip.background = element_blank())

[Image: resulting plot]
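To see what the reordering and re-labelling steps do on their own, here is a minimal sketch on a made-up miniature of `db_tb`. The data frame `toy` and its values are assumptions for illustration only; the two `mutate()` steps mirror the ones in the pipeline above.

```r
library(dplyr)

# Hypothetical stand-in for db_tb: the baseline (Phase A) ends at
# occasion 7 for ID 1, occasion 4 for ID 2 and occasion 3 for ID 3.
toy <- data.frame(
  ID       = c(1, 1, 1, 2, 2, 2, 3, 3, 3),
  Occasion = c(6, 7, 8, 3, 4, 5, 2, 3, 4),
  Phase    = c("A", "A", "B", "A", "A", "B", "A", "A", "B")
)

relabelled <- toy %>%
  # last baseline occasion per subject
  group_by(ID) %>%
  mutate(intervention.timing = max(Occasion[Phase == "A"])) %>%
  ungroup() %>%
  # factor levels sorted by timing; the integer codes become the new IDs
  mutate(ID = forcats::fct_reorder(factor(ID), intervention.timing),
         new.ID = as.integer(ID))

levels(relabelled$ID)
# "3" "2" "1"

# one row per subject: original ID, timing, and new ID
distinct(relabelled, ID, intervention.timing, new.ID)
```

The subject whose baseline ends earliest (original ID 3) becomes the first factor level, so it gets `new.ID` 1 and is drawn in the top facet.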

Z.Lin
  • This is a great answer! I would like to follow up on two points: 1) Do you know a way to change the subject numbers before plotting the data? My real data set is larger than this one, and I need to keep the subject numbers consistent across different graphs. 2) Do you know if it is possible to put the labels outside the graph? – Michael Matta Sep 27 '19 at 21:55
  • For 1), you may want to filter subjects by specific ID / top X subjects / etc. before passing the data into ggplot. For 2), see if [this](https://stackoverflow.com/questions/50201928/avoid-ggplot2-to-partially-cut-axis-text/50202854) is relevant for you? – Z.Lin Sep 29 '19 at 05:30
  • Thanks for your help. I was able to filter the subjects, but facet_grid brings back the original order of the cases. I posted a new question here: https://stackoverflow.com/questions/58174787/force-facet-grid-to-plot-facets-in-the-same-order-as-they-appear-in-the-data-set As for the second question, that link does not seem to apply to my case: I would like to plot something outside the graph, whereas that question was about labels that are cut off. – Michael Matta Sep 30 '19 at 20:20
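
One possible way to keep subject numbers consistent across graphs, sketched under assumptions: build a lookup table once from the full data set, join it onto each subset, and store the new number as a factor with explicit levels. `facet_grid()` orders panels by factor levels rather than by row order. The names `id_map` and `subset_tb` and all values below are hypothetical and not taken from the thread.

```r
library(dplyr)
library(ggplot2)

# Hypothetical lookup table, built once from the full data set:
# original ID -> display number (ordered by intervention timing).
id_map <- data.frame(ID = c(3, 2, 1), new.ID = c(1, 2, 3))

# Hypothetical subset that leaves out subject 2.
subset_tb <- data.frame(
  ID       = c(1, 1, 1, 3, 3, 3),
  Occasion = c(1, 2, 3, 1, 2, 3),
  outcome  = c(15, 14, 8, 15, 7, 14)
) %>%
  left_join(id_map, by = "ID") %>%
  # explicit levels fix both the labels and the top-to-bottom facet order
  mutate(new.ID = factor(new.ID, levels = sort(id_map$new.ID)))

p <- ggplot(subset_tb, aes(x = Occasion, y = outcome)) +
  geom_point() +
  geom_line() +
  facet_grid(new.ID ~ .)  # panels follow the factor levels
```

Because the numbers come from `id_map` rather than from the rows that happen to be present, original subject 3 stays labelled 1 and original subject 1 stays labelled 3 even though subject 2 was filtered out.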