
This is related to an earlier question: R: How to pass parameters to ggplot geom_ within a function?

I want to be able to write utility functions that use geom_xxxxx functions. For example, I want a function that overlays milestones on any time-series plot, as shown below.

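set.seed(0);   library(data.table);  library(ggplot2);  library(lubridate)  # lubridate provides ydm()
  DT <- data.table(
    Date = as.Date(1:100, origin="2010-01-01"),
    state = rep_len(LETTERS[1:3], 100),  # explicit recycling: 3 does not divide 100 evenly
    sex = c("m","f"),
    value = as.integer(runif(100)*100)
  )
  dtMilestones <- data.table(
    Date = ydm(paste0("2010-15-", 2:4)),  # year-day-month: 15 Feb, 15 Mar, 15 Apr 2010
    Event = paste("Phase", 1:3)
  )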
  
  g0 <- ggplot(DT, aes(Date, value)) +
    geom_line() +
    facet_grid(state ~ sex, scales = "free")
  g0
  
  g <- g0
  for (i in 1:nrow(dtMilestones)) {
    g <- g + 
      geom_label(aes(x=dtMilestones$Date[i], max(value), label=dtMilestones$Event[i]),label.size = 0.5) +
      geom_vline(xintercept=dtMilestones$Date[i], linetype=5)
  }
  g

The code above creates the two plots below, without and with milestones. (Ignore the wrong label placement; that is for another Stack Overflow question.)

[Image: faceted time-series plot without milestones]

[Image: the same plot with milestone lines and labels]

How can I make a function out of it? I tried the code below, but it does not work:


  g.add.milestones <- function(dtMilestones) {
    gg <- list()
    for (i in 1:nrow(dtMilestones)) {
      gg[[i]] <- 
        geom_label(aes(x=milestones[i], max(value), label=dtMilestones$Event[i]),label.size = 0.5)
      gg[[i+nrow(dtMilestones)]]  <- 
        geom_vline(xintercept=dtMilestones$Date[i], linetype=5)
    }
  } 
  g0 + g.add.milestones(dtMilestones)

PS. I would also welcome code that prints the labels where needed (beside EACH vertical line, on top of the highest point in each facet - currently all of them are printed in the )

IVIM

2 Answers


Does this do what you want?

g.add.milestones <- function(dtMilestones) {
  list(
    geom_vline(xintercept = dtMilestones$Date, linetype=5),
    geom_label(aes(x=Date, y = Inf, label=Event),
               label.size = 0.5, vjust = 1,
               data = dtMilestones)
  )
}
  
g0 + g.add.milestones(dtMilestones)
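The reason this works is that `+` accepts a list of layers and adds each element in turn. Here is a minimal sketch of that behaviour on toy data; `df`, `base_plot` and `reference_lines()` are hypothetical names made up for the illustration, not part of the question.

```r
library(ggplot2)

# Hypothetical toy data, not the question's DT
df <- data.frame(x = 1:10, y = (1:10)^2)
base_plot <- ggplot(df, aes(x, y)) + geom_line()

# A helper that returns several layers bundled in a list
reference_lines <- function(at) {
  list(
    geom_vline(xintercept = at, linetype = 5),
    geom_hline(yintercept = 0, colour = "grey50")
  )
}

# Adding the list ...
p_list <- base_plot + reference_lines(c(3, 7))

# ... should give the same layers as adding them one by one
p_chain <- base_plot +
  geom_vline(xintercept = c(3, 7), linetype = 5) +
  geom_hline(yintercept = 0, colour = "grey50")

# Both plots hold the line layer plus the two added layers
length(p_list$layers) == length(p_chain$layers)
```

Because a single `geom_vline()` or `geom_label()` call is vectorised over its data, the function needs no loop at all, which is likely why it is so much faster than adding one layer per milestone.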


Jon Spring
  • Beautiful solution, and fast too! When I was using a `for` loop to plot this, it took about 100 times longer. I only wonder why such useful functionality of the geom_** functions does not seem to be documented well anywhere. – IVIM Mar 21 '23 at 01:52
  • Btw, this function is extremely handy for applications such as http://opencanada.info/ where you have multiple time series that you need to relate to one another (for example, vaccine intake to spikes in deaths) – IVIM Mar 21 '23 at 01:56

The final statement in your function, i.e. the one whose value gets returned, is a for loop, and a for loop always returns NULL (invisibly). We can see this with a short dummy example:

x = for (i in 1) 1
x
# NULL

You need to return gg from your function by ending it with either gg or return(gg).
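As a rough sketch of the loop version with that fix applied: `add_milestones_loop()` and the small `milestones` data frame are hypothetical stand-ins for the question's function and `dtMilestones`. The question's `milestones[i]` is not defined in the code shown, so the sketch assumes the i-th milestone date was intended, and it passes each row as `data` rather than indexing by `i` inside `aes()`.

```r
library(ggplot2)

# Hypothetical stand-in for the question's dtMilestones
milestones <- data.frame(
  Date  = as.Date(c("2010-02-15", "2010-03-15", "2010-04-15")),
  Event = paste("Phase", 1:3)
)

add_milestones_loop <- function(milestones) {
  n <- nrow(milestones)
  layers <- vector("list", 2 * n)
  for (i in seq_len(n)) {
    # Pass the i-th row as data so aes() does not depend on the loop index,
    # which would only be evaluated later, when the plot is built
    row_i <- milestones[i, ]
    layers[[i]] <- geom_label(aes(x = Date, y = Inf, label = Event),
                              data = row_i, vjust = 1)
    layers[[i + n]] <- geom_vline(xintercept = row_i$Date, linetype = 5)
  }
  layers  # last evaluated expression, so this list is the return value
}

layers <- add_milestones_loop(milestones)
length(layers)  # one label layer and one line layer per milestone
```

With the question's objects, `g0 + add_milestones_loop(dtMilestones)` should then add all of these layers to the plot.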

dww