I would like a plot that makes overlaps very clear. More specifically, I want to plot where a given individual works. Because individuals may work at different workplaces at the same time, I plot a geom_line for each workplace. My question is: how do I make it clear when there are overlaps? I tried using some transparency in geom_line(), but I am looking for something that makes the overlaps stand out.

Below is a simple example with 3 individuals and 2 workplaces. Individual A switches from workplace 1 to workplace 2, individual B does the same but is unemployed for some time in between, and individual C works in both places during a short period (this is where I see the overlap).

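# packages used below: dplyr for %>%, filter() and mutate(); ggplot2 for the plot
library(dplyr)
library(ggplot2)

# individual A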
a_id <- c(rep('A',25))
a_period <- c(seq(1, 13), seq(13,24))
a_workplace <-c(rep(1,13), rep(2,12))

# individual B
b_id <- c(rep('B',19))
b_period <- c(seq(2,8), seq(13,24))
b_workplace <-c(rep(1,7), rep(2,12))

# individual C
c_id <- c(rep('C',9))
c_period <- c(seq(1,4), seq(2,6))
c_workplace <-c(rep(1,4), rep(2,5))

# final affiliation data
id <- c(a_id, b_id, c_id)
period <- c(a_period, b_period, c_period)
workplace <- c(a_workplace, b_workplace, c_workplace)
mydata <- data.frame(id, period, workplace)

# affiliation data by workplace
mydata_1 <- mydata %>%
  filter(workplace==1) %>%
  mutate(workplace=as.factor(workplace))
mydata_2 <- mydata %>%
  filter(workplace==2) %>%
  mutate(workplace=as.factor(workplace))

I tried the code below, but I would still like to make the overlaps clearer. Perhaps there is a combination of colours that makes overlaps easier to see?

ggplot(mydata_1, aes(period, id, group=id, col=workplace)) +
  geom_line(alpha=0.4) +
  geom_line(data=mydata_2, alpha=0.4, aes(period, id, group=id, col=workplace)) +
  labs(x="time", y=NULL, title="Work affiliation") +
  scale_x_continuous(breaks = seq(0,24, by=2)) +
  scale_y_discrete(limits=rev) +
  scale_color_manual(values=c("dodgerblue","firebrick1")) +
  theme(legend.position = c(.7, .92), legend.direction = "horizontal",
        legend.background = element_rect(linetype="solid", colour ="black"), 
        panel.background = element_rect(fill = "grey97"))

I also don't understand why I can't see the right workplace colours in the legend.

PaulaSpinola

1 Answer

I think I would combine the data sets, then have a single geom_line call with a position_dodge. This simplifies your code, shows the overlaps, and displays your legend correctly.

all_data <- rbind(mydata_1, mydata_2)

ggplot(all_data, aes(x = id, y = period, color = workplace)) +
  geom_line(position = position_dodge(width = 0.1), size = 2) +  # in ggplot2 >= 3.4.0, use linewidth = 2 instead of size
  labs(y = "time", title = "Work affiliation") +
  scale_y_continuous(breaks = seq(0, 24, by = 2)) +
  scale_x_discrete(limits = rev) +
  scale_color_manual(values = c("dodgerblue", "firebrick1")) +
  coord_flip() +
  theme(legend.position   = c(.7, .92), 
        legend.direction  = "horizontal",
        legend.background = element_rect(linetype = "solid", colour = "black"), 
        panel.background  = element_rect(fill = "grey97"))

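As a complement to the dodged plot, it may help to list the overlaps explicitly. The following is a minimal sketch, not part of the answer above: it assumes the `mydata` layout from the question (columns `id`, `period`, `workplace`), rebuilds that data so the block runs on its own, and uses base R only. The names `counts` and `overlaps` are illustrative.

```r
# Rebuild the example data from the question (columns: id, period, workplace).
mydata <- data.frame(
  id        = c(rep("A", 25), rep("B", 19), rep("C", 9)),
  period    = c(1:13, 13:24, 2:8, 13:24, 1:4, 2:6),
  workplace = c(rep(1, 13), rep(2, 12), rep(1, 7), rep(2, 12), rep(1, 4), rep(2, 5))
)

# Count the distinct workplaces for each individual in each period.
counts <- aggregate(workplace ~ id + period, data = mydata,
                    FUN = function(w) length(unique(w)))
names(counts)[3] <- "n_workplaces"

# Keep only the id/period pairs with more than one workplace: the overlaps.
overlaps <- counts[counts$n_workplaces > 1, ]
overlaps <- overlaps[order(overlaps$id, overlaps$period), ]
print(overlaps)
```

For the example data this flags individual C in periods 2 to 4, and also individual A in period 13, because both of A's sequences include period 13. If that shared switch-over period should not count as an overlap, the data would need adjusting first.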

Allan Cameron
  • May I just ask why you had `x=id, y=period` and then flipped it with `coord_flip()` instead of having `x=period, y=id` straight away? – PaulaSpinola Oct 16 '21 at 07:57
  • @PaulaSpinola this was just to allow the position dodge to work. – Allan Cameron Oct 16 '21 at 09:21
  • Thanks a lot @Allan Cameron! I have now transformed the data to consider the entire interval each individual works in each workplace. I tried to adapt the code you suggested here while using `geom_segment` instead of `geom_line` but I am having problems: https://stackoverflow.com/questions/69607718/plotting-geom-segment-with-position-dodge-in-r – PaulaSpinola Oct 17 '21 at 19:08