I have a question about ordering the stacked bars in a swimmer plot made with ggplot2 in R.

I have a sample dataset of (artificial) patients, who receive treatments.

library(tidyverse)

df <- read.table(text="patient start_t_1 t_1_duration start_t_2 t_2_duration start_t_3 t_3_duration start_t_4 t_4_duration end
                 1    0    1.5    1.5   3   NA    NA    4.5    10   10
                 2    0    2    4.5    2    NA    NA    2   2.5   10
                 3    0    5    5   2   7   0.5   7.5   2   9.5
                 4    0    8    NA    NA    NA    NA    8   2   10", header=TRUE)

All patients start the first treatment at time = 0. Subsequently, patients get different treatments (numbered t_2 up to t_4).

I tried to plot the swimmer plot, using the following code:

df %>% 
  gather(variable, value, c(t_1_duration, t_2_duration, t_3_duration, t_4_duration)) %>% 
  ggplot(aes(x = patient, y = value, fill = variable)) + 
  geom_bar(stat = "identity") +
  coord_flip()

However, the treatments are not displayed in the right order. For example, patient 3 receives all treatments in consecutive order, while patient 2 receives treatment 1 first, then 4, and finally 2. So simply reversing the order does not work.

How do I order the stacked bars in a chronological way?

user213544

1 Answer


What about this:

df %>% 
  gather(variable, value, c(t_1_duration, t_2_duration, t_3_duration,t_4_duration)) %>% 
  ggplot(aes(x = patient,
             y = value,
             # here you can specify the order of the variable
             fill = factor(variable, 
                          levels =c("t_4_duration", "t_3_duration", "t_2_duration","t_1_duration")))) + 
  geom_bar(stat = "identity") +
  coord_flip()+ guides(fill=guide_legend("My title")) 


EDIT: this took a while, because it involves a kind of hack. I don't think it's a dupe of that question, because it also involves some data reshaping:

library(reshape2)

# split the start times and the durations into separate data frames
starts <- df %>% select(patient, start_t_1, start_t_2, start_t_3, start_t_4)
duration <- df %>% select(patient, t_1_duration, t_2_duration, t_3_duration, t_4_duration)

# melt each to long format and derive a shared treatment key ("t_1" ... "t_4")
starts <- melt(starts, id = "patient") %>%
  mutate(keytreat = substr(variable, nchar(as.character(variable)) - 2, nchar(as.character(variable)))) %>%
  select(patient, start = value, keytreat)
duration <- melt(duration, id = "patient") %>%
  mutate(keytreat = substr(variable, 1, 3)) %>%
  select(patient, duration = value, keytreat)

# join on patient and treatment, sort by start time, drop treatments never given
dats <- starts %>%
  left_join(duration, by = c("patient", "keytreat")) %>%
  arrange(patient, start) %>%
  filter(!is.na(start))


# here the part for the plot
bars <- map(unique(dats$patient)
            , ~geom_bar(stat = "identity", position = "stack"
                        , data = dats %>% filter(patient == .x)))

dats %>% 
  ggplot(aes(x = patient,
             y = duration,
             fill = reorder(keytreat,-start))) + 
  bars +
  guides(fill=guide_legend("ordering"))  + coord_flip()

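If the goal is a strictly chronological layout, one possible alternative is to skip stacking altogether and draw each treatment as a rectangle running from its start time to start + duration. The following is only a sketch under that assumption: it uses the `df` from the question, `long` is a hypothetical name introduced here, the `end` column is ignored, and the bar half-height of 0.4 is an arbitrary choice.

```r
library(tidyverse)

df <- read.table(text = "patient start_t_1 t_1_duration start_t_2 t_2_duration start_t_3 t_3_duration start_t_4 t_4_duration end
                 1    0    1.5    1.5   3   NA    NA    4.5    10   10
                 2    0    2    4.5    2    NA    NA    2   2.5   10
                 3    0    5    5   2   7   0.5   7.5   2   9.5
                 4    0    8    NA    NA    NA    NA    8   2   10", header = TRUE)

# One row per patient and treatment, with `start` and `duration` columns.
long <- df %>%
  select(-end) %>%
  # start_t_1 -> t_1_start, so every column reads <treatment>_<measure>
  rename_with(~ sub("^start_(t_[0-9])$", "\\1_start", .x)) %>%
  pivot_longer(-patient,
               names_to = c("treatment", ".value"),
               names_pattern = "(t_[0-9])_(start|duration)") %>%
  filter(!is.na(start)) %>%   # drop treatments a patient never received
  arrange(patient, start)

# Position each bar by its actual start time instead of stacking.
p <- ggplot(long) +
  geom_rect(aes(xmin = start, xmax = start + duration,
                ymin = patient - 0.4, ymax = patient + 0.4,
                fill = treatment)) +
  scale_y_continuous(breaks = unique(long$patient)) +
  labs(x = "time", y = "patient", fill = "treatment")

print(p)
```

Because the position comes from `start` rather than from the stacking order, patient 2 should show treatment 1, then 4, then 2. Any gap between treatments would appear as empty space, which stacking cannot represent.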

s__
  • It is closer to a solution than I was myself, so thanks :)! However, the ordering is not yet chronological. For example, patient 2 receives treatment 1 first, then treatment 4, and finally treatment 2, while the colors in the graph correspond to treatments 1 -> 2 -> 4. So the order of treatments should follow the time at which each treatment is started. Any idea how to incorporate this? – user213544 Dec 03 '18 at 11:43