Similar to this question, Split up time series per year for plotting, which was answered in Python, I want to display a daily time series as multiple lines, one per year. How can I achieve this in R?

library(ggplot2)
library(dplyr)

# Dummy data
df <- data.frame(
  day = as.Date("2017-06-14") - 0:364,
  value = runif(365) + seq(-140, 224)^2 / 10000
)

# Basic line plot
p <- ggplot(df, aes(x=day, y=value)) +
  geom_line() + 
  xlab("")
p

One solution uses ggplot2, but the date labels are displayed incorrectly:

library(tidyverse)
library(lubridate)
p <- df %>% 
  # mutate(date = ymd(date)) %>% 
  mutate(date = as.Date(day)) %>%  # the date column in df is named `day`
  mutate(
    year = factor(year(date)),     # use year to define separate curves
    date = update(date, year = 1)  # use a constant year for the x-axis
  ) %>% 
  ggplot(aes(date, value, color = year)) +
  scale_x_date(date_breaks = "1 month", date_labels = "%b")

# Raw daily data
p + geom_line()

An alternative solution is to use gg_season() from the feasts package:

library(feasts)
library(tsibble)
library(dplyr)
tsibbledata::aus_retail %>%
  filter(
    State == "Victoria",
    Industry == "Cafes, restaurants and catering services"
  ) %>%
  gg_season(Turnover)

References:

Split up time series per year for plotting

R - How to create a seasonal plot - Different lines for years

ah bon

2 Answers

3

I tend to think simple is better:

transform(df, year = format(day, "%Y")) |>
  ggplot(aes(x=day, y=value, group=year, color=year)) +
  geom_line() +
  xlab(NULL)

ggplot line by year

Optionally, remove the year legend with + guides(colour = "none").

r2evans
  • I agree with regards to this data set, but I get the impression that the OP was looking for something that could plot January to December, with potentially overlapping series showing multiple years on a single plot. From a data viz viewpoint, I would normally stick with your approach unless I really wanted to highlight a specific seasonal change in one particular year. – Allan Cameron Feb 07 '23 at 13:18
  • That's a good thought, I hadn't interpreted the question that way (I think the question is a bit vague). Thanks @AllanCameron – r2evans Feb 07 '23 at 13:20
  • ... and I lost an upvote, sigh. – r2evans Feb 07 '23 at 13:21
  • You got mine, but not everyone sees the bigger picture it seems! – Allan Cameron Feb 07 '23 at 13:23
3

If you want your x axis to represent the months from January to December, then perhaps the simplest approach is to convert each date's day of the year (yday) into an offset from the first of January of an arbitrary year:

library(tidyverse)
library(lubridate)

df <- data.frame(
  day = as.Date("2017-06-14") - 0:364,
  value = runif(365) + seq(-140, 224)^2 / 10000
)

df %>% 
  mutate(year = factor(year(day)),
         # yday() is 1 on 1 January, so subtract 1 to keep that day on 1 January
         date = yday(day) - 1 + as.Date('2017-01-01')) %>% 
  ggplot(aes(date, value, color = year)) +
  geom_line() +
  scale_x_date(breaks = seq(as.Date('2017-01-01'), by = 'month', length.out = 12),
               date_labels = '%b')

Created on 2023-02-07 with reprex v2.0.2

Allan Cameron
  • Thanks for sharing this solution. Is it possible to avoid passing `2017-01-01` as a parameter? I hope to convert this into a general function that works for any daily, weekly, or even monthly time series data. – ah bon Feb 07 '23 at 13:23
  • @ahbon the point is, the year doesn't matter. This could be the first of January 2000 or 1066. It's simply specifying that we want a number of days on from the first of January. The year never appears in the plot, so it is irrelevant, and will work on any set of dates. The year can be safely hard-coded in the body of the function without it being specified as a parameter. – Allan Cameron Feb 07 '23 at 13:25
  • I got it, but I found a new issue. If there are `NA`s in the `value` column, it does not work. How could we deal with this? – ah bon Feb 07 '23 at 14:09
  • @ahbon it still works for me on your example data if I put several NA values in the value column (or indeed in the day column). I get the same axes, but gaps in the lines where the NA values are (as expected). Are you able to show a reproducible example of it not working? – Allan Cameron Feb 07 '23 at 14:15
  • Sorry, after reviewing my code, I found that the error does not come from the `NA`s. `ggplot(aes(date, 'value', color = year))` generates the error; instead, I should use `ggplot(aes(date, value, color = year))` or ggplot(aes(date, `value`, color = year)). – ah bon Feb 07 '23 at 14:36
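
Following the comments above, the approach could be wrapped into a reusable function along these lines. This is only a sketch: the names align_to_year() and plot_by_year(), and the default reference year of 2000, are assumptions made for illustration and do not come from the answers. The reference year is hard-coded as a default, as suggested in the comments, and a leap year is used so that day 366 still falls inside it.

```r
library(ggplot2)

# Hypothetical helper: map every date onto one fixed reference year so that
# different years overlap on a January-to-December axis.
# base_year is an assumption; a leap year keeps day 366 inside the same year.
align_to_year <- function(dates, base_year = 2000) {
  day_of_year <- as.integer(format(dates, "%j"))  # 1 for 1 January
  as.Date(sprintf("%d-01-01", base_year)) + (day_of_year - 1)
}

# Hypothetical wrapper: date_col and value_col are column names given as strings.
plot_by_year <- function(data, date_col, value_col, base_year = 2000) {
  plot_data <- data.frame(
    date  = align_to_year(data[[date_col]], base_year),
    value = data[[value_col]],
    year  = factor(format(data[[date_col]], "%Y"))  # one line per calendar year
  )
  month_starts <- seq(as.Date(sprintf("%d-01-01", base_year)),
                      by = "month", length.out = 12)
  ggplot(plot_data, aes(date, value, color = year)) +
    geom_line() +
    scale_x_date(breaks = month_starts, date_labels = "%b") +
    xlab(NULL)
}

# Same dummy data as in the question
df <- data.frame(
  day = as.Date("2017-06-14") - 0:364,
  value = runif(365) + seq(-140, 224)^2 / 10000
)
p <- plot_by_year(df, "day", "value")
p
```

Because the column names are passed as strings and looked up with [[ ]], the quoted-name problem from the last comment does not arise inside aes(). Note that aligning by day of the year shifts dates after February by one day between leap and non-leap years, which is usually invisible at this scale.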