40

I would like to display months (in abbreviated form) along the horizontal axis, with the corresponding year printed once. I know how to display month-year:

enter image description here

The un-needed repetition of the year clutters the labels. Instead I would like something like this:

enter image description here

except that the year would be printed below the months.

I printed the year above the axis labels because that's the best I could do. This follows from a limitation of the annotate() function, whose output gets clipped if it lies outside the plot area. I am aware of possible workarounds based on annotation_custom(), but I couldn't make them work with date objects (I did not try converting dates to numbers and back to dates again, as that seemed more complicated than should be necessary).
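For reference, the date-to-number round trip mentioned above is short in base R. This is only a sketch of the conversion itself; whether the numeric form is what annotation_custom() needs for its xmin/xmax on a date axis is an assumption and is not tested here. The two dates are the ones used in the annotate() calls below.

```r
# A Date is stored as the number of days since 1970-01-01,
# so dates before 1970 map to negative numbers.
d <- as.Date(c("1719-10-01", "1720-10-01"))

# Date -> number
d_num <- as.numeric(d)

# number -> Date
d_back <- as.Date(d_num, origin = "1970-01-01")

all(d == d_back)
```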

I'm wondering if the new dup_axis() could be hijacked for this purpose. If, instead of sending the duplicated axis to the opposite side of the panel, it could place it a few lines below the original axis, then perhaps it would just be a matter of setting up one axis with panel.grid.major blanked out and the labels set to %b, while the other axis would have panel.grid.minor blanked out and the labels set to %Y. (An added challenge is that the year labels would be shifted to October instead of January.)

These questions are related. However, as far as I can tell, the annotation_custom() and textGrob() functions do not play well with dates.

how-can-i-add-annotations-below-the-x-axis-in-ggplot2

displaying-text-below-the-plot-generated-by-ggplot2

Data and basic code below:

    library("ggplot2")
    library("scales")
    ggplot(data = df, aes(x = Date, y = value)) + geom_line() +
        scale_x_date(date_breaks = "2 month", date_minor_breaks = "1 month", labels = date_format("%b %Y")) +
        xlab(NULL)

    ggplot(data = df, aes(x = Date, y = value)) + geom_line() +
        scale_x_date(date_minor_breaks = "2 month", labels = date_format("%b")) +   
        annotate(geom = "text", x = as.Date("1719-10-01"), y = 0, label = "1719") +
        annotate(geom = "text", x = as.Date("1720-10-01"), y = 0, label = "1720") +
        xlab(NULL)


    # data
    df <- structure(list(Date = structure(c(-91455, -91454, -91453, -91452, 
    -91451, -91450, -91448, -91447, -91446, -91445, -91444, -91443, 
    -91441, -91440, -91439, -91438, -91437, -91436, -91434, -91433, 
    -91431, -91430, -91429, -91427, -91426, -91425, -91424, -91423, 
    -91422, -91420, -91419, -91418, -91417, -91416, -91415, -91413, 
    -91412, -91411, -91410, -91409, -91408, -91406, -91405, -91404, 
    -91403, -91402, -91401, -91399, -91398, -91397, -91396, -91395, 
    -91394, -91392, -91391, -91390, -91389, -91388, -91387, -91385, 
    -91384, -91382, -91381, -91380, -91379, -91377, -91376, -91375, 
    -91374, -91373, -91372, -91371, -91370, -91369, -91368, -91367, 
    -91366, -91364, -91363, -91362, -91361, -91360, -91359, -91357, 
    -91356, -91355, -91354, -91353, -91352, -91350, -91349, -91348, 
    -91347, -91346, -91345, -91343, -91342, -91341, -91340, -91339, 
    -91338, -91336, -91335, -91334, -91333, -91332, -91331, -91329, 
    -91328, -91327, -91326, -91325, -91324, -91322, -91321, -91320, 
    -91319, -91315, -91314, -91313, -91312, -91311, -91310, -91308, 
    -91307, -91306, -91305, -91304, -91303, -91301, -91300, -91299, 
    -91298, -91297, -91296, -91294, -91293, -91292, -91291, -91290, 
    -91289, -91287, -91286, -91285, -91284, -91283, -91282, -91280, 
    -91279, -91278, -91277, -91276, -91275, -91273, -91272, -91271, 
    -91270, -91269, -91268, -91266, -91265, -91264, -91263, -91262, 
    -91261, -91259, -91258, -91257, -91256, -91255, -91254, -91252, 
    -91251, -91250, -91249, -91248, -91247, -91245, -91244, -91243, 
    -91242, -91241, -91240, -91238, -91237, -91236, -91235, -91234, 
    -91233, -91231, -91230, -91229, -91228, -91227, -91226, -91224, 
    -91223, -91222, -91221, -91220, -91219, -91217, -91216, -91215, 
    -91214, -91213, -91212, -91210, -91209, -91208, -91207, -91205, 
    -91201, -91200, -91199, -91198, -91196, -91195, -91194, -91193, 
    -91192, -91191, -91189, -91188, -91187, -91186, -91185, -91184, 
    -91182, -91181, -91180, -91179, -91178, -91177, -91175, -91174, 
    -91173, -91172, -91171, -91170, -91168, -91167, -91166, -91165, 
    -91164, -91163, -91161, -91160, -91159, -91158, -91157, -91156, 
    -91154, -91153, -91152, -91151, -91150, -91149, -91147, -91146, 
    -91145, -91144, -91143, -91142, -91140, -91139, -91138, -91131, 
    -91130, -91129, -91128, -91126, -91125, -91124, -91123, -91122, 
    -91121, -91119, -91118, -91117, -91116, -91115, -91114, -91112, 
    -91111, -91110, -91109, -91108, -91107, -91104, -91103, -91102, 
    -91101, -91100, -91099, -91097, -91096, -91095, -91094, -91093, 
    -91091, -91090, -91089, -91088, -91087, -91086, -91084, -91083, 
    -91082, -91081, -91080, -91079, -91077, -91076, -91075, -91074, 
    -91073, -91072, -91070, -91069, -91068, -91065, -91063, -91062, 
    -91061, -91060, -91059, -91058, -91056, -91055, -91054, -91053, 
    -91052, -91051, -91049, -91048, -91047, -91046, -91045, -91044, 
    -91042, -91041, -91040, -91039, -91038, -91037, -91035, -91034, 
    -91033, -91032, -91031, -91030, -91028, -91027, -91026, -91025, 
    -91024, -91023, -91021, -91020, -91019, -91018, -91017, -91016, 
    -91014, -91013, -91012, -91011, -91010, -91009, -91007, -91006, 
    -91005, -91004, -91003, -91002, -91000, -90999, -90998, -90997, 
    -90996, -90995, -90993, -90992, -90991, -90990, -90989, -90988, 
    -90986, -90985, -90984, -90983, -90982), class = "Date"), value = c(113, 
    113, 113, 113, 114, 114, 114, 115, 115, 115, 116, 116, 116, 116, 
    117, 117, 117, 117, 116, 117, 116, 116, 116, 117, 117, 117, 117, 
    117, 117, 117, 116, 117, 116, 116, 116, 117, 117, 117, 117, 117, 
    117, 117, 116, 116, 117, 117, 117, 117, 117, 117, 117, 117, 117, 
    117, 117, 118, 118, 118, 118, 117, 118, 117, 117, 117, 117, 117, 
    117, 118, 116, 116, 116, 116, 116, 116, 116, 117, 117, 118, 118, 
    118, 118, 118, 119, 120, 120, 119, 119, 120, 120, 121, 121, 122, 
    124, 124, 122, 123, 124, 123, 123, 123, 123, 123, 124, 124, 126, 
    126, 126, 126, 126, 125, 125, 126, 127, 126, 126, 125, 126, 126, 
    126, 128, 128, 128, 130, 133, 131, 133, 134, 134, 134, 136, 136, 
    136, 135, 135, 135, 136, 136, 136, 136, 135, 135, 135, 135, 130, 
    129, 129, 130, 131, 136, 138, 155, 157, 161, 170, 174, 168, 165, 
    169, 171, 181, 184, 182, 179, 181, 179, 175, 177, 177, 174, 170, 
    174, 173, 178, 173, 178, 179, 182, 184, 184, 180, 181, 182, 182, 
    184, 184, 188, 195, 198, 220, 255, 275, 350, 310, 315, 320, 320, 
    316, 300, 310, 310, 320, 317, 313, 312, 310, 297, 285, 285, 286, 
    288, 315, 328, 338, 344, 345, 352, 352, 342, 335, 343, 340, 342, 
    339, 337, 336, 336, 342, 347, 352, 352, 351, 352, 352, 351, 352, 
    352, 355, 375, 400, 452, 487, 476, 475, 473, 485, 500, 530, 595, 
    720, 720, 770, 750, 770, 750, 735, 740, 745, 735, 700, 700, 750, 
    760, 755, 755, 760, 760, 765, 950, 950, 950, 875, 875, 875, 880, 
    880, 880, 900, 900, 900, 880, 880, 890, 895, 890, 880, 870, 870, 
    870, 870, 870, 860, 860, 860, 860, 850, 840, 810, 820, 810, 810, 
    805, 810, 805, 820, 815, 820, 805, 790, 800, 780, 760, 765, 750, 
    740, 820, 810, 800, 800, 775, 750, 810, 750, 740, 700, 705, 660, 
    630, 640, 595, 590, 570, 565, 535, 440, 400, 410, 400, 405, 390, 
    370, 300, 300, 180, 200, 310, 290, 260, 260, 275, 260, 270, 265, 
    255, 250, 210, 210, 200, 195, 210, 215, 240, 240, 220, 220, 220, 
    220, 210, 212, 208, 220, 210, 212, 208, 220, 215, 220, 214, 214, 
    213, 212, 210, 210, 195, 195, 160, 160, 175, 205, 210, 208, 197, 
    181, 185)), .Names = c("Date", "value"), row.names = c(NA, 393L
    ), class = "data.frame")
Henrik
PatrickT
  • why not just use `colour = format(Date,"%Y")` in the `aes()`. It is cleaner imo instead of trying to hack together a custom x-axis. – mtoto Jun 18 '17 at 16:08
  • related https://stackoverflow.com/questions/20571306/multi-row-x-axis-labels-in-ggplot-line-chart – tjebo Jul 24 '21 at 09:07

4 Answers

63

The code below provides two potential options for adding year labels.

Option 1a: Faceting

You could use faceting to mark the years. For example:

    library(ggplot2)
    library(lubridate)

    ggplot(df, aes(Date, value)) +
      geom_line() +
      scale_x_date(date_labels="%b", date_breaks="month", expand=c(0,0)) +
      facet_grid(~ year(Date), space="free_x", scales="free_x", switch="x") +
      theme_bw() +
      theme(strip.placement = "outside",
            strip.background = element_rect(fill=NA,colour="grey50"),
            panel.spacing=unit(0,"cm"))

enter image description here

Note that with this approach, if there are missing dates at the beginning or end of a year (by "missing", I mean rows for those dates are not even present in the data) then the x-axis will start/end at the first/last date in the data for that year, rather than go from Jan-1 to Dec-31. In that case, you'd need to add in rows for the missing dates and either NA for value or interpolate value. In addition, with this method there is no space or line between December 31 of one year and January 1 of the next year, so there's a discontinuity across each year.
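A minimal sketch of the "add in rows for the missing dates" step, in base R. The toy data frame `obs` and the names `full` and `padded` are made up for illustration; the idea is just a left join against a full calendar so that absent dates get NA for value.

```r
# Hypothetical toy data: observations for only a few days of 1720
obs <- data.frame(
  Date  = as.Date(c("1720-03-01", "1720-03-02", "1720-03-04")),
  value = c(10, 11, 13)
)

# Full calendar (Jan 1 to Dec 31) for every year present in the data
yrs <- as.integer(format(obs$Date, "%Y"))
full <- data.frame(
  Date = seq(as.Date(sprintf("%d-01-01", min(yrs))),
             as.Date(sprintf("%d-12-31", max(yrs))),
             by = "day")
)

# Left join: dates without an observation get value = NA
padded <- merge(full, obs, by = "Date", all.x = TRUE)
```

Passing `padded` instead of the original data to ggplot() should make each facet span the whole year, with geom_line() leaving gaps where value is NA. Interpolating the NAs instead is a separate choice that depends on the data.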

Option 1b: Faceting + centered month labels

To address @AF7's comment: you can center the month labels by adding some spaces before each label. But you have to choose the number of spaces manually, depending on the physical size of the plot when you print it to a device. (There's probably a way to center the labels programmatically based on the internal grob measurements, but I'm not sure how to do it.) I've also removed the minor vertical gridlines and lightened the line between years.

    ggplot(df, aes(Date, value)) +
      geom_line() +
      scale_x_date(date_labels=paste(c(rep(" ",11), "%b"), collapse=""),
                   date_breaks="month", expand=c(0,0)) +
      facet_grid(~ year(Date), space="free_x", scales="free_x", switch="x") +
      theme_bw() +
      theme(strip.placement = "outside",
            strip.background = element_blank(),
            panel.grid.minor.x = element_blank(),
            panel.border = element_rect(colour="grey70"),
            panel.spacing=unit(0,"cm"))

enter image description here
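Since the number of spaces has to be tuned by hand, it may be convenient to wrap the padding in a small helper. `pad_month_fmt` is a hypothetical name; it builds the same string as the paste(c(rep(" ", 11), "%b"), collapse="") expression above, using strrep() from base R.

```r
# Hypothetical helper: a date format with n leading spaces before "%b"
pad_month_fmt <- function(n) paste0(strrep(" ", n), "%b")

fmt <- pad_month_fmt(11)

# Same string as the expression used in Option 1b
identical(fmt, paste(c(rep(" ", 11), "%b"), collapse = ""))

# The padding carries through to the formatted label
format(as.Date("1720-06-01"), fmt)
```

With this, the scale becomes scale_x_date(date_labels = pad_month_fmt(11), ...), and only the single number changes when the output size changes.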

Option 2a: Edit the x-axis label grob

Here's a more complex and finicky method (though it could likely be automated by someone who understands the structure and unit spacings of grid graphics better than I do) that avoids the pitfalls of the faceting method described above:

    library(grid)

    # Fake data with an extra year added for illustration
    set.seed(2)
    df = data.frame(Date=seq(as.Date("1718-03-01"),as.Date("1721-09-20"), by="1 day"))
    df$value = cumsum(rnorm(nrow(df)))

    # The plot we'll start with
    p = ggplot(df, aes(Date, value)) +
      geom_vline(xintercept=as.numeric(df$Date[yday(df$Date)==1]), colour="grey60") +
      geom_line() +
      scale_x_date(date_labels="%b", date_breaks="month", expand=c(0,0)) +
      theme_bw() +
      theme(panel.grid.minor.x = element_blank()) +
      labs(x="")

enter image description here

Now we want to add the year values below and in between June and July of each year. The code below does that by modifying the x-axis label grob and is adapted from this SO answer by @SandyMuspratt.

    # Get the grob
    g <- ggplotGrob(p)

    # Get the x axis (the bottom axis, "axis-b")
    index <- which(g$layout$name == "axis-b")  # Which grob
    xaxis <- g$grobs[[index]]

    # Get the ticks (labels and marks)
    ticks <- xaxis$children[[2]]

    # Get the labels
    ticksB <- ticks$grobs[[2]]

    # Edit x-axis label grob
    # Find every index of Jun in the x-axis labels and add a newline and
    # then a year label
    junes = which(ticksB$children[[1]]$label == "Jun")
    ticksB$children[[1]]$label[junes] = paste0(ticksB$children[[1]]$label[junes],
                                               "\n      ", unique(year(df$Date)))

    # Put the edited labels back into the plot
    ticks$grobs[[2]] <- ticksB
    xaxis$children[[2]] <- ticks
    g$grobs[[index]] <- xaxis

    # Draw the plot
    grid.newpage()
    grid.draw(g)

enter image description here

Option 2b: Edit the x-axis label grob and center the month labels

Below is the only change that needs to be made to Option 2a to center the month labels, but, once again, the number of spaces needs to be tweaked manually.

    # Make the edit
    # Center the month labels between ticks
    ticksB$children[[1]]$label = paste0(paste(rep(" ",7),collapse=""), ticksB$children[[1]]$label)

    # Find every index of Jun in the x-axis labels and add a year label
    junes = grep("Jun", ticksB$children[[1]]$label)
    ticksB$children[[1]]$label[junes] = paste0(ticksB$children[[1]]$label[junes], "\n      ", unique(year(df$Date)))

enter image description here

eipi10
  • That's even better than what I was trying to do! Thanks! – PatrickT Jun 18 '17 at 16:40
  • With the first method, maybe something like `panel.border = element_rect(colour="grey80")`. With the second method, change the `geom_vline` color to change the prominence of the line or just remove the `geom_vline` statement completely if you don't want to demarcate the years. – eipi10 Jun 18 '17 at 18:15
  • 1
    I find this x-scale a little bit misleading. I would say that the best way would be to have months surrounded by ticks, so that, e.g. "Jan" starts with a tick and ends with a tick. Currently, January starts with "Jan" and ends with "Feb". This is usually fine for numbers, but is a bit misleading for months imho. Thus I would shift the tick labels to the right by a small amount (half the width of a tick). – AF7 Jun 18 '17 at 21:15
  • It just gets better every time! Comments: __1__ ``xlab("")`` is important to make space to print the year. Thus, ``xlab(NULL)`` won't give that space. __2__ To prevent labels from cluttering, the output needs to be large enough, e.g. ggsave(g, file = "a.pdf", width = 12, height = 4). __3__ you can still set ``date_breaks = "2 month"``, but then the tweaking of label placements needs to be adapted. – PatrickT Jun 19 '17 at 06:05
  • Does the centering of the labels survive printing with ``ggsave``? (I'm using the 4 years of data you gave, printing to pdf or png with ``width = 12`` and am not observing the centering of labels of option 2b) – PatrickT Jun 19 '17 at 06:23
  • 1
    My plot goes from 01/01/2010 to 06/30/2015. I don't have enough room on the x-axis to keep the first 3 letters of each month. I kept the labels J,F,M,A,M,J,J,.... You use 'Jun' to set the position of the year labels, The only work around I could think of was to change "J" for June to " J ", and change `junes = grep("June", ticksB$children[[1]]$label)` to `junes = grep(" J ", ticksB$children[[1]]$label)` (space before and after J for June). This adds extra spacing around the June labels. How could I fix this? Here's [how my plot looks](https://s1.postimg.org/5kikkc9msv/test_plot.png) – Dave Rosenman Oct 23 '17 at 16:50
15

I came upon this question and thought maybe I can add a solution. We can display both month and year in every year's first displayed month by using a simple condition. You can play with the date_breaks to remove January from the labels, and this will still work. I'm using month() and year() from lubridate.

    library(tidyverse)
    library(lubridate)

    df %>%
      ggplot(aes(Date, value)) +
      geom_line() +
      scale_x_date(date_breaks = "2 months",
                   labels = function(x) if_else(is.na(lag(x)) | year(lag(x)) != year(x),
                                                paste(month(x, label = TRUE), "\n", year(x)),
                                                paste(month(x, label = TRUE))))

enter image description here
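The rule itself can be checked without drawing a plot. Below is a base-R sketch of the same idea (show the year only on the first break of each year); `year_on_first` is a hypothetical name, and unlike the version above it uses paste0(), so there are no spaces around the newline.

```r
# Base-R sketch: label each break with the month, and add the year
# on the first break of each year.
year_on_first <- function(x) {
  yr   <- format(x, "%Y")
  prev <- c(NA, head(yr, -1))              # year of the previous break
  first_of_year <- is.na(prev) | prev != yr
  ifelse(first_of_year,
         paste0(format(x, "%b"), "\n", yr),
         format(x, "%b"))
}

# Breaks every two months: Oct and Dec 1719, then Feb, Apr, Jun 1720
breaks <- seq(as.Date("1719-10-01"), as.Date("1720-06-01"), by = "2 months")
labels <- year_on_first(breaks)
labels
```

Only the first and third labels should carry a year here, since those are the first breaks falling in 1719 and 1720 respectively.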

pasipasi
9

If you want to try to hack together a sub-label, you could convert it to a grob. I edited this from the original post to create a function that adds the sublabels and returns a gtable object. Note that the sublabs input must be the same length as your x-axis breaks:

    library(ggplot2)
    library(scales)    # date_format()
    library(grid)
    library(gtable)
    library(gridExtra)

    add_sublabs <- function(plot, sublabs){

      gg <- ggplotGrob(plot)

      axis_num <- which(gg$layout[,"name"] == "axis-b")

      xbreaks <- gg[["grobs"]][[axis_num]][["children"]][[2]][["grobs"]][[2]][["children"]][[1]]$x
      if(length(xbreaks) != length(sublabs)) stop("Sub-labels must be the same length as the x-axis breaks")

      to_breaks <- c(as.numeric(xbreaks),1)[which(!duplicated(sublabs, fromLast = TRUE))+1]
      sublabs_x <- diff(c(0,to_breaks))
      sublabs_labels <- sublabs[!duplicated(sublabs, fromLast = TRUE)]

      tg <- tableGrob(matrix(sublabs_labels, nrow = 1))
      tg$widths = unit(sublabs_x, attr(xbreaks,"unit"))

      pos <- gg$layout[axis_num,c("t","l")]

      gg2 <- gtable_add_rows(gg, heights = sum(tg$heights)+unit(4,"mm"), pos = pos$t)
      gg3 <- gtable_add_grob(gg2, tg, t = pos$t+1, l = pos$l)

      return(gg3)
    }


    #Plot and sublabels
    p <- ggplot(data = df, aes(x = Date, y = value)) + geom_line() +
      scale_x_date(date_breaks = "2 month", date_minor_breaks = "1 month", labels = date_format("%b")) +
      xlab(NULL)
    sublabs <- c(rep("1719",2),rep("1720",6))

    #Draw
    grid.draw(add_sublabs(p, sublabs))

enter image description here
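To see how add_sublabs turns the sub-label vector into cell widths, here is the same arithmetic on plain numbers. The tick positions below are made up (fractions of the panel width); in the real function they come from the axis grob.

```r
# Hypothetical tick positions and the sub-labels from the example above
xbreaks <- (1:8) / 10
sublabs <- c(rep("1719", 2), rep("1720", 6))

# Index of the last break in each run of identical sub-labels
last_of_group <- which(!duplicated(sublabs, fromLast = TRUE))   # 2 and 8

# Each cell ends at the break after its last member; the final cell ends at 1
right_edges <- c(xbreaks, 1)[last_of_group + 1]

# Cell widths are the gaps between consecutive right edges, starting from 0
cell_widths <- diff(c(0, right_edges))
cell_labels <- sublabs[last_of_group]
```

So the "1719" cell runs from the left edge of the panel to the first tick labelled 1720, and the "1720" cell takes the rest. This also shows why the sub-labels need to be grouped in contiguous runs: a value that reappears later would be collapsed into its last occurrence.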

Mike H.
  • That's very nice Mike. I can see that this would be a great way to label things like, say, "President X", "President Y" with corresponding matching fill colors. Thumbs up! – PatrickT Jun 18 '17 at 17:23
  • 2
    @PatrickT thanks! I edited it to what I think is a slightly better way. I wrote a quick function that takes a vector of sublabels and a plot. I think it's more intuitive/succinct than my original way. – Mike H. Jun 18 '17 at 18:10
2

One way to avoid the complexities would be to change the required output so that January is replaced by the year.

The lab function returns the labels given the breaks. Unexpectedly, ggplot will pass NAs to it so in the first line of the function body we replace those with some date -- it does not matter which date since such values are not subsequently used by ggplot. Finally we format the date as a year or abbreviated month depending on whether the month is January (which corresponds to the POSIXlt component mon equalling 0) or not.

    library(ggplot2)
    library(scales)

    lab <- function(b) {
      b[is.na(b)] <- Sys.Date()
      format(b, ifelse(as.POSIXlt(b)$mon == 0, "%Y", "%b"))
    }

    ggplot(df, aes(Date, value)) +
      geom_line() +
      scale_x_date(date_breaks = "month", labels = lab)

screenshot
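The behaviour of lab, including the NA replacement, can be checked on a plain vector of breaks without building a plot. The break dates below are made up for illustration; lab is repeated so the snippet runs on its own.

```r
lab <- function(b) {
  b[is.na(b)] <- Sys.Date()
  format(b, ifelse(as.POSIXlt(b)$mon == 0, "%Y", "%b"))
}

# An NA (as ggplot2 was observed to pass), then Dec, Jan and Feb breaks
b <- as.Date(c(NA, "1719-12-01", "1720-01-01", "1720-02-01"))
out <- lab(b)
out
```

The January break should come back as the year and the others as abbreviated month names (locale-dependent); the NA slot is filled with a label derived from today's date, which the explanation above says is not subsequently used.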

Note: I have added Issue 2182 to the ggplot2 GitHub issues list regarding the NAs that are passed to the label function. If subsequent versions of ggplot2 no longer pass the NAs, then the first line of the body of lab could be omitted.

Update: fixed.

G. Grothendieck