
I have successfully created a line graph in R using ggplot2, with percentage on the y-axis and date/time on the x-axis, but I am unsure how to annotate inside the graph at specific date/time points where there is a high or low peak.

The examples I found (on R-bloggers and RPubs) annotate plots that do not have a date/time axis, and my own attempts (with ggtext, the annotate function, etc.) got nowhere. Please can you show me an example of how to do this using ggplot2 in R?

The current R code below creates the line graph, but can you help me extend the code to annotate inside of the graph?

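library(dplyr)      # %>%, filter(), between()
library(lubridate)  # ymd_hm()
library(ggplot2)    # plotting
library(scales)     # date_breaks(), date_format()

sentimentdata <- read.csv("sentimentData-problem.csv", header = TRUE, sep = ",", stringsAsFactors = FALSE)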

sentimentTime <- sentimentdata %>%
        filter(between(Hour, 11, 23)) 

sentimentTime$Datetime <- ymd_hm(sentimentTime$Datetime)

library(zoo)

sentimentTime %>%
  filter(Cat %in% c("Negative", "Neutral", "Positive")) %>%
  ggplot(aes(x = Datetime, y = Percent, group = Cat, colour = Cat)) +
  geom_line() +
  scale_x_datetime(breaks = date_breaks("1 hours"), labels = date_format("%H:00")) +
  labs(title="Peak time on day of event", colour = "Sentiment Category") +
  xlab("By Hour") +
  ylab("Percentage of messages") 

Data source available via GitHub:

jr134

1 Answer


Since you have multiple lines and you want two labels on each line, one at its maximum and one at its minimum, you could create two small data frames to pass to geom_text calls.

First we ensure the necessary packages and the data are loaded:

library(lubridate)
library(ggplot2)
library(scales)
library(dplyr)

url <- paste0("https://raw.githubusercontent.com/jcool12/",
              "datasets/master/sentimentData-problem.csv")

sentimentdata          <- read.csv(url, stringsAsFactors = FALSE)
sentimentdata$Datetime <- dmy_hm(sentimentdata$Datetime)
sentimentTime          <- filter(sentimentdata, between(Hour, 11, 23)) 
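
Note that the timestamps are parsed here with dmy_hm, whereas the question used ymd_hm, which suggests the file stores them day-first. As a quick check of what dmy_hm returns, here is a minimal sketch on a made-up timestamp (the sample string is an assumption, not a value taken from the CSV):

```r
library(lubridate)

# Hypothetical day-first timestamp; the real CSV values may be formatted differently
sample_stamp <- "11/06/2020 11:30"

# dmy_hm() reads day, month, year, hour, minute and returns a POSIXct in UTC
parsed <- dmy_hm(sample_stamp)

class(parsed)                      # "POSIXct" "POSIXt"
format(parsed, "%Y-%m-%d %H:%M")   # "2020-06-11 11:30"
```

Because the result is a POSIXct, it can be used directly as an x position on a scale_x_datetime axis.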

Now we can create a max_table and min_table that hold the x and y co-ordinates and the labels for our maxima and minima:

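max_table <- sentimentTime %>% 
  group_by(Cat) %>% 
  summarise(Datetime = Datetime[which.max(Percent)],
            # build the label before Percent is overwritten, so it shows the true peak
            label    = paste(trunc(max(Percent)), "%"),
            # place the label 3 units above the peak
            Percent  = max(Percent) + 3)

min_table <- sentimentTime %>% 
  group_by(Cat) %>% 
  summarise(Datetime = Datetime[which.min(Percent)],
            # build the label before Percent is overwritten, so it shows the true trough
            label    = paste(trunc(min(Percent)), "%"),
            # place the label 3 units below the trough
            Percent  = min(Percent) - 3)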
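
To see what such a table contains, here is a small sketch on toy data (the toy data frame and its values are invented for illustration and only mimic the Cat, Datetime and Percent columns). Within summarise, later expressions see columns created earlier in the same call, so in this sketch the label is built before Percent is shifted:

```r
library(dplyr)

# Hypothetical toy data standing in for sentimentTime
toy <- data.frame(
  Cat      = rep(c("Negative", "Positive"), each = 3),
  Datetime = rep(as.POSIXct(c("2020-06-11 11:00",
                              "2020-06-11 12:00",
                              "2020-06-11 13:00"), tz = "UTC"), times = 2),
  Percent  = c(10, 40, 25, 60, 30, 45)
)

# One row per category: when the peak occurs, its label, and where to draw the label
toy_max <- toy %>%
  group_by(Cat) %>%
  summarise(Datetime = Datetime[which.max(Percent)],
            label    = paste(max(Percent), "%"),
            Percent  = max(Percent) + 3)

toy_max
```

For this toy input, the Negative peak of 40 falls at 12:00 and the Positive peak of 60 at 11:00, and each label is positioned 3 units above its peak.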

These two tables let us create the plot without much trouble:

sentimentTime %>%
  filter(Cat %in% c("Negative", "Neutral", "Positive")) %>%
  ggplot(aes(x = Datetime, y = Percent, group = Cat, colour = Cat)) +
    geom_line() +
    geom_text(data = min_table, aes(label = label)) +     # minimum labels
    geom_text(data = max_table, aes(label = label)) +     # maximum labels
    scale_x_datetime(breaks = date_breaks("1 hours"), 
                     labels = date_format("%H:00")) +
    labs(title="Peak time on day of event", colour = "Sentiment Category") +
    xlab("By Hour") +
    ylab("Percentage of messages") 
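
If only a single point needs a note, annotate can also be used on a date/time axis, as long as the x position is given as a POSIXct value rather than a string. A minimal sketch on invented data (the toy series, the peak_time value and the label text are all assumptions for illustration):

```r
library(ggplot2)

# Hypothetical hourly series; in practice this would be sentimentTime
toy <- data.frame(
  Datetime = as.POSIXct("2020-06-11 11:00", tz = "UTC") + 3600 * 0:4,
  Percent  = c(20, 35, 50, 30, 25)
)

# The x position must be a POSIXct to match the datetime axis
peak_time <- as.POSIXct("2020-06-11 13:00", tz = "UTC")

p <- ggplot(toy, aes(x = Datetime, y = Percent)) +
  geom_line() +
  annotate("point", x = peak_time, y = 50, colour = "red") +   # mark the peak
  annotate("text",  x = peak_time, y = 53, label = "Peak: 50 %")  # note above it

p
```

This suits one-off notes; for one label per line, the data frame approach above scales better because the positions are computed rather than typed in.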


Allan Cameron
  • Thank you for making this example, really appreciate it. I have done what you asked, and provided info for reproducible example. – jr134 Jun 11 '20 at 11:27
  • @jr134 I have completely rewritten my answer to make it directly applicable to your problem – Allan Cameron Jun 11 '20 at 12:11
  • Thank you so much for taking the time to do this, really big help. – jr134 Jun 11 '20 at 13:52
  • I was wondering if I wanted more than two labels for each line according to the maxima and minima, how would I go about doing this? – jr134 Jun 11 '20 at 21:39
  • @jr134 the easiest way is to create a data frame with the positions and labels you need then call an extra `geom_text` using this data frame. – Allan Cameron Jun 11 '20 at 21:41
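
The suggestion in the last comment could be sketched as follows. The data, positions and label text are invented for illustration; the only requirement is that the label data frame carries the columns mapped in aes (here Datetime, Percent and Cat) plus a label column:

```r
library(ggplot2)

# Hypothetical toy data standing in for sentimentTime
toy <- data.frame(
  Cat      = rep(c("Negative", "Positive"), each = 4),
  Datetime = rep(as.POSIXct("2020-06-11 11:00", tz = "UTC") + 3600 * 0:3,
                 times = 2),
  Percent  = c(10, 40, 25, 30, 60, 30, 45, 35)
)

# Hand-built label table: one row per annotation, as many rows as needed
extra_labels <- data.frame(
  Cat      = c("Negative", "Positive", "Positive"),
  Datetime = as.POSIXct(c("2020-06-11 12:00",
                          "2020-06-11 11:00",
                          "2020-06-11 13:00"), tz = "UTC"),
  Percent  = c(43, 63, 48),            # a little above each point
  label    = c("Event A", "Event B", "Event C")
)

p <- ggplot(toy, aes(x = Datetime, y = Percent, group = Cat, colour = Cat)) +
  geom_line() +
  geom_text(data = extra_labels, aes(label = label), show.legend = FALSE)

p
```

Adding another annotation is then just a matter of adding a row to extra_labels.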