
I have a dataset that records the presence or absence of a variable by hour and day. I'd like to plot only the presences (i.e., just "Y") by hour and day, with day on the x-axis and hour on the y-axis. I'm just not good at this stuff and it's giving me trouble. See the example data:

df <- data.frame(
  day  = c(1, 1, 1, 2, 2, 2, 3, 3, 3),
  hour = c(1, 2, 3, 1, 2, 2, 1, 2, 3),
  a    = c("Y", "Y", "N", "N", "N", "Y", "N", "N", "Y"),
  b    = c("N", "N", "Y", "N", "Y", "Y", "Y", "Y", "Y")
)

Would love any suggestions.

brc

1 Answer


You can try this approach:

Get the data in long format, keep only the 'Y' values, and plot a scatterplot.

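library(tidyverse)

df %>%
  # Reshape to long format: one row per day/hour/variable combination
  pivot_longer(cols = a:b) %>%
  # Keep presences only
  filter(value == 'Y') %>%
  # Treat day and hour as discrete axes
  mutate(across(c(day, hour), factor)) %>%
  ggplot() + aes(day, hour, color = name) + geom_point()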
Ronak Shah
  • This worked really well. If you only wanted to plot one column (just "a") and you filter for "Y" then it drops extralimital hour values so the y-axis no longer has all 24 hours. Any suggestions for correcting this? – brc May 01 '21 at 13:06
  • You can use `scale_y_continuous`/`scale_y_discrete`. Something like this https://stackoverflow.com/questions/41917761/r-ggplot-getting-all-discrete-x-values-to-be-displayed-on-axis-in-histogram might work – Ronak Shah May 01 '21 at 13:45
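
For the follow-up about plotting only column "a" without losing hours from the y-axis, here is a minimal sketch of the `scale_y_discrete` route. It assumes hours are coded 1 to 24 (the example data only go up to 3, so adjust to 0:23 if that matches your data); the names `all_hours`, `plot_df`, and `p` are made up for illustration. The idea is to set the factor levels before filtering, then ask ggplot2 not to drop unused levels.

```r
library(dplyr)
library(ggplot2)

df <- data.frame(
  day  = c(1, 1, 1, 2, 2, 2, 3, 3, 3),
  hour = c(1, 2, 3, 1, 2, 2, 1, 2, 3),
  a    = c("Y", "Y", "N", "N", "N", "Y", "N", "N", "Y"),
  b    = c("N", "N", "Y", "N", "Y", "Y", "Y", "Y", "Y")
)

# Assumption: the full set of hours is 1..24 (change to 0:23 if needed)
all_hours <- 1:24

plot_df <- df %>%
  mutate(
    day  = factor(day),                       # levels fixed before filtering
    hour = factor(hour, levels = all_hours)   # declare every hour as a level
  ) %>%
  filter(a == "Y")                            # keep presences for "a" only

p <- ggplot(plot_df, aes(day, hour)) +
  geom_point() +
  scale_x_discrete(drop = FALSE) +            # keep days with no "Y" rows
  scale_y_discrete(drop = FALSE)              # keep hours with no "Y" rows

print(p)
```

If you would rather keep hour numeric, something like `scale_y_continuous(limits = c(1, 24), breaks = 1:24)` should give a similar axis without converting to a factor.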