I have a series of lists describing duration (in days) of events, and I would like to plot this data as lines to compare the lists.

Below is some example data on what lunch options were served on which days at school. I have already parsed my data and this is the reduced form. Originally it was in the form of complex character strings.

soup = c(15:18)
grilledcheese = c(0:19)
pasta = c(3:13)

I want to create a graph similar to this one, with days on the x axis and soup, grilled cheese, and pasta on the y axis: [image]

I looked online and I'm not sure what kind of graph to use for this. Part of the difficulty is that the data does not start at 0 and the y axis should represent factors.

What I tried:

I tried plotting this in ggplot but it only takes data frames. I am wondering if there is a way to plot directly from lists. It seems like there should be a straightforward solution here that maybe I am missing.

I also tried this:

plot(x = grilledcheese, y = rep(1, length(grilledcheese)))

which is closer to what I want, but I'm not sure how to plot multiple factors on the y axis.

Kelsey
  • What did you try? What step are you struggling with? Is there a reason the data is not in a data.frame? Could you provide a [minimal reproducible example](https://stackoverflow.com/questions/5963269/how-to-make-a-great-r-reproducible-example/5963610#5963610)? P.S. especially if this is a homework (and it *looks* a lot like one ;) you should show some previous work [see also here](https://meta.stackoverflow.com/questions/334822/how-do-i-ask-and-answer-homework-questions) – dario Oct 20 '21 at 12:58
  • This is actually not homework, although I can see why you think that. I am a PhD student plotting clinical data on the progression of disease in patients, so "grilledcheese" might actually correspond to "fever" or "headache." I started with some more convoluted character strings describing days symptoms were experienced and have reduced it to the lists of numbers described above. The image is of a graph published 20 years ago which I am trying to recreate with the updated data. I will provide some more info in the original post as well. Thanks for your response and any help you can provide! – Kelsey Oct 20 '21 at 13:21

2 Answers

First, let's get your data in a shape easier to handle with ggplot2:

library(tidyverse)    
soup = c(15:18)
grilledcheese = c(0:19)
pasta = c(3:13)


# Keep only the first and last day of each food, one column per food
df <- data.frame(soup = c(min(soup), max(soup)),
                 grilledcheese = c(min(grilledcheese), max(grilledcheese)),
                 pasta = c(min(pasta), max(pasta)))

# Reshape to one row per food, with its start (minv) and end (maxv) day
df <- pivot_longer(df, cols = 1:3) %>% 
  group_by(name) %>% 
  mutate(minv = min(value),
         maxv = max(value)) %>% 
  ungroup() %>% 
  select(-value) %>% 
  distinct()

Data

# A tibble: 3 x 3
  name           minv  maxv
  <chr>         <int> <int>
1 soup             15    18
2 grilledcheese     0    19
3 pasta             3    13
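
As a side note, the same summary table can probably be built more directly from a named list, without pivoting. This is only a minimal base R sketch: the names `foods` and `ranges` are my own placeholders and do not appear in the question.

```r
soup <- 15:18
grilledcheese <- 0:19
pasta <- 3:13

# Hypothetical named list holding the day vectors from the question
foods <- list(soup = soup, grilledcheese = grilledcheese, pasta = pasta)

# One row per food, with its first and last day
ranges <- data.frame(
  name = names(foods),
  minv = unname(sapply(foods, min)),
  maxv = unname(sapply(foods, max))
)
```

`ranges` has the same name, minv and maxv columns as df, so it should work in the ggplot call of the next section in place of df.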

Graph

We can then plot the different elements you want: the starting and ending dots for each line, the lines themselves and the axis theme.

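ggplot(df) +
  # one horizontal segment per food, from its first to its last day
  geom_segment(aes(x = minv, xend = maxv, y = name, yend = name)) +
  # dots at both ends of each segment
  geom_point(aes(x = minv, y = name)) +
  geom_point(aes(x = maxv, y = name)) +
  # one tick per day, with no padding around the 0-20 range
  scale_x_continuous(breaks = c(0:20),
                     labels = c(0:20),
                     limits = c(0,20),
                     expand = c(0,0)) +
  # note: in ggplot2 3.4.0 and later, element_line() takes linewidth instead of size
  theme(axis.ticks.x = element_line(size = 1),
        axis.ticks.y = element_blank(),
        axis.ticks.length = unit(.25, "cm"),
        axis.line.x = element_line(size = 1),
        panel.background = element_blank()) +
  labs(x = "",
       y = "")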

We get this:

First_graph

This should do the trick.

Extra custom

Now, if you want the tick labels to sit between the ticks, you might want to check here, because you will have to reshape your data and build the graph once you have all the food types you want. Until then, I just add spacing within the labels:

ggplot(df) +
  geom_segment(aes(x = minv, xend = maxv, y = name, yend = name)) +
  geom_point(aes(x = minv, y = name)) +
  geom_point(aes(x = maxv, y = name)) +
  # pad each label with leading spaces so the number appears shifted to the right of its tick
  scale_x_continuous(breaks = c(0:20),
                     labels = paste("         ",0:20),
                     limits = c(0,20),
                     expand = c(0,0)) +
  theme(axis.ticks.x = element_line(size = 1),
        axis.ticks.y = element_blank(),
        axis.ticks.length =unit(.25, "cm"),
        axis.line.x = element_line(size = 1),
        panel.background = element_blank()) +
  labs(x = "",
       y = "")

Graph2

MonJeanJean

You will first need to reshape your data into a data frame. You could do, e.g.:

soup = c(15:18)
grilledcheese = c(0:19)
pasta = c(3:13)

## make dataframe

library(tidyverse)
# one row per day; each food gets a fixed y position (1, 2, 3) on the days it was served, NA otherwise
data <- tibble(x = 0:20) %>% 
  mutate(soup_y          = ifelse(x %in% soup, 1, NA),
         grilledcheese_y = ifelse(x %in% grilledcheese, 2, NA),
         pasta_y         = ifelse(x %in% pasta, 3, NA))

Here, I use the knowledge that your x axis values are between 0 and 20. You could also choose them by, e.g., min(c(soup,grilledcheese,pasta)) and max(c(soup,grilledcheese,pasta)), or some other logic.
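
A minimal sketch of that alternative, assuming the axis should simply span the observed days. The names `day_range` and `x_values` are placeholders of mine.

```r
soup <- 15:18
grilledcheese <- 0:19
pasta <- 3:13

# Smallest and largest day across all foods
day_range <- range(c(soup, grilledcheese, pasta))

# Candidate x axis values, one per day in that range
x_values <- seq(day_range[1], day_range[2])
```

With the example data this runs from 0 to 19 rather than to the hard-coded 20, so you may still want to pad the upper end by hand.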

Following the idea from this answer, I hard-code the y axis positions for the three foods as 1, 2, and 3.

The ggplot command reads as:

library(ggplot2)
ggplot(data, aes(x = x)) + 
  geom_line(aes(y = soup_y)) +
  geom_line(aes(y = grilledcheese_y)) +
  geom_line(aes(y = pasta_y)) + 
  scale_y_continuous(labels = NULL, breaks = NULL) + labs(y = "") +  ## drop y axis labels (y is numeric here)
  scale_x_continuous(labels = seq(0, 20, 1), breaks = seq(0, 20, 1)) +  # x axis tick marks
  annotate("text", label = c('soup', 'grilledcheese', 'pasta'), x = 0, y = c(1, 2, 3),
           vjust = -.2, hjust = -.3)  # add labels

output

Otto Kässi