
I am trying to graph a temperature dataset using mean, max, and min temps by month over 2 years. The graph includes two horizontal temperature thresholds.

I have succeeded in creating a graph, but I want to add the labels "9.9" and "12.97" to my two horizontal threshold lines. I am having trouble, I think because the x-axis is a date.

Here is the dput() sample of my data (hob_m_cs1_sort):

structure(list(year = c(2021, 2021, 2021, 2021), month = c(2, 
3, 4, 5), tmin_mean = c(10.625, 8.27870967741936, 7.78666666666667, 
9.34225806451613), tmax_mean = c(15.255, 15.8003225806452, 16.869, 
18.6835483870968), tmean = c(12.3655534638554, 11.5371012544803, 
11.9291921296296, 13.5006406810036), date = structure(c(18659, 
18687, 18718, 18748), class = "Date"), month_name = c("Feb", 
"Mar", "Apr", "May")), row.names = c(NA, 4L), class = "data.frame")

This is the code I have been using:

hob_m_cs1_sort %>% group_by(date) %>%
  summarise(min = min(tmin_mean, na.rm = TRUE),
            max = max(tmax_mean, na.rm = TRUE),
            avg = mean(tmean,na.rm = TRUE)) %>%
  gather(metric, value, -date) %>%
  ggplot(.,aes(x = date, y = value, 
               group = metric, color = metric)) + 
  labs(color='Temperature') +
  ggtitle ("Hakalau Monthly Temperatures: Pua 'Akala, 1510 m") +
  theme(plot.title = element_text(hjust = 0.5)) +
  xlab("Date") +  ylab ("Temperature ( ºC )") +
  scale_y_continuous(limits = c(2.5, 22.5), breaks = seq(5, 25, by = 5)) +
  scale_x_date(date_breaks = "2 months", date_labels = "%b %Y") +
  theme_ipsum() +
  theme(axis.text.x=element_text(angle=60, hjust=1)) +
  geom_line(aes(color = metric)) + 
  geom_hline(aes(yintercept=h, linetype = "Culex development"), colour= 'darkorange1') +
  geom_hline(aes(yintercept=h2, linetype = "Avian malaria development"), colour= 'red') +
  scale_linetype_manual(name = "Temperature Thresholds", values = c(2, 2), 
                        guide = guide_legend(override.aes = list(color = c("red", "darkorange1")))) +
  scale_color_manual(values = c("steelblue1", "navyblue", "darkturquoise"), breaks=c('max', 'avg', 'min'), labels=c('Max', 'Avg', 'Min'))

I am able to produce this graph, but there are no labels on the thresholds.

I have tried these options but they are not producing labels for me:

geom_text(aes(0, h, label = h, vjust = - 1)) +
  geom_text(aes(0, h2, label = h2, vjust = - 1)) +

geom_text(aes("2021-02-01", h, label = h)) +
  geom_text(aes("2021-02-01", h2, label = h2)) +
  
annotate(y= 9.9, x = dmy("01/02/2021"), label="Normal Limit", geom = "label")

Please help! Thanks :)

steph
  • Welcome to SO! Your approach with `annotate` looks fine as in that case you used a proper date. But without any data to run your code one can only guess what might be the issue. Hence, to help you any further we need [a minimal reproducible example](https://stackoverflow.com/questions/5963269/how-to-make-a-great-r-reproducible-example) including a snippet of your data or some fake data. – stefan Jan 12 '23 at 22:50
  • What does "have not worked" mean specifically? This simplified example worked for me, but maybe it's not what you are looking for. Some more explanation might help. `ggplot(data.frame(date = as.Date("2021-01-01") + 1:100, val = rpois(100, 100))) + geom_point(aes(date, val)) + annotate(y= 100, x = lubridate::dmy("01/02/2021"), hjust = 0, label="Normal Limit", geom = "label")` – Jon Spring Jan 13 '23 at 01:26
  • Please provide enough code so others can better understand or reproduce the problem. – Community Jan 13 '23 at 01:27
  • Thanks so much for your input, and apologies for the lack of details! I added the first 4 lines of my dataset, and have clarified your questions. I still cannot get the labels to appear on the threshold lines. – steph Jan 18 '23 at 04:11

1 Answer


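You need to remind R that you're dealing with dates: the x position of a label has to be a Date object, not a character string or a bare number. You can use lubridate::as_date for the conversion. I've removed a good deal of code that wasn't necessary for the problem.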

  • May I suggest using vectors for the annotation instead, so that you need only one call to annotate.
  • May I suggest the geomtextpath package and direct labelling of your lines with a proper label rather than the value. Why? The value is already represented by the height of the line, and a direct label makes it easier for the reader to understand what the line means.
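To make the point about dates concrete, here is a small base R sketch of how the x values from the question are typed. The names x_chr and x_date are made up for illustration; base as.Date is used here only to show that the conversion does not depend on lubridate.

```r
# Illustrative names only; they are not part of the original code.
x_chr  <- "2021-02-01"           # plain character string, as in the geom_text attempt
x_date <- as.Date("2021-02-01")  # a real Date object

class(x_chr)   # "character"
class(x_date)  # "Date"

# A Date is stored as the number of days since 1970-01-01,
# so a bare 0 on a date axis refers to 1970-01-01, not to the start of the plot.
as.Date(0, origin = "1970-01-01")

# The converted value matches the first entry of the date column in the dput() output.
as.numeric(x_date)  # 18659
```

Only x_date has the same class as the date column, which is why the annotate attempt with dmy() was on the right track.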

Smaller comments / suggestions in the code

library(tidyverse)
library(lubridate)
#> Loading required package: timechange
#> 
#> Attaching package: 'lubridate'
#> The following objects are masked from 'package:base':
#> 
#>     date, intersect, setdiff, union
library(geomtextpath)
hob_m_cs1_sort <- structure(list(
  year = c(2021, 2021, 2021, 2021),
  month = c(2, 3, 4, 5),
  tmin_mean = c(10.625, 8.27870967741936, 7.78666666666667, 9.34225806451613),
  tmax_mean = c(15.255, 15.8003225806452, 16.869, 18.6835483870968),
  tmean = c(12.3655534638554, 11.5371012544803, 11.9291921296296, 13.5006406810036),
  date = structure(c(18659, 18687, 18718, 18748), class = "Date"),
  month_name = c("Feb", "Mar", "Apr", "May")
), row.names = c(NA, 4L), class = "data.frame")


h <- 9.9
h2 <- 12.97

## I like to store as a proper data frame if more than one manipulation step 
hob_long <- hob_m_cs1_sort %>% group_by(date) %>%
  summarise(min = min(tmin_mean, na.rm = TRUE),
            max = max(tmax_mean, na.rm = TRUE),
            avg = mean(tmean,na.rm = TRUE)) %>%
  gather(metric, value, -date)

ggplot(hob_long, aes(x = date, y = value, group = metric, color = metric)) + 
  ## removed aes, as specified in main ggplot call 
  geom_line() + 
  geom_hline(aes(yintercept=h, linetype = "Culex development"), colour= 'darkorange1') +
  geom_hline(aes(yintercept=h2, linetype = "Avian malaria development"), colour= 'red') +
  ## do both in one call, use vectors
  annotate("text", x = as_date(c("2021-02-01", "2021-02-01")), y = c(h, h2), label = c(h, h2))


## how I would do the plot
ggplot(hob_long, aes(x = date, y = value, group = metric, color = metric)) + 
  geom_line() + 
  geom_texthline(aes(yintercept=h, label = "Culex development"), lty = 2, colour= 'darkorange1') +
  geom_texthline(aes(yintercept=h2, label = "Avian malaria development"), lty = 2, colour= 'red') 

Created on 2023-01-18 with reprex v2.0.2
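If you prefer to keep the numeric labels, a possible variant is to anchor them at the first date in the data and nudge them above the lines. This is only a sketch under assumptions: demo and thresholds are hypothetical stand-ins for hob_long and h / h2, and the hjust and vjust values are a starting point to adjust by eye.

```r
library(ggplot2)

# Hypothetical stand-in data, shaped like one metric of hob_long
demo <- data.frame(
  date  = as.Date(c("2021-02-01", "2021-03-01", "2021-04-01", "2021-05-01")),
  value = c(12.4, 11.5, 11.9, 13.5)
)
thresholds <- c(9.9, 12.97)  # same values as h and h2

p <- ggplot(demo, aes(x = date, y = value)) +
  geom_line() +
  geom_hline(yintercept = thresholds[1], linetype = 2, colour = "darkorange1") +
  geom_hline(yintercept = thresholds[2], linetype = 2, colour = "red") +
  # x is a Date taken from the data, so it always falls inside the date axis;
  # hjust = 0 left-aligns the text there, and a negative vjust lifts it above the line
  annotate("text",
           x = min(demo$date), y = thresholds,
           label = as.character(thresholds),
           hjust = 0, vjust = -0.5)

p
```

Taking x from min(demo$date) avoids hard-coding a date string, so the labels stay at the left edge if the date range changes.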

tjebo
  • Thank you so much! That worked beautifully, and your improved graph idea is much more elegant. Thanks again! – steph Jan 30 '23 at 21:28