4

I have a rather long timeseries that I want to plot in ggplot, but it's sufficiently long that even using the full width of the page it's barely readable.

one time series in one panel

What I want to do instead is to divide the plot into 2 (or more, in the general case) panels one on top of each other.

I could do this manually, but that is cumbersome, and it is hard to get the axes to share the same scale. Ideally I would like to write something like this:

ggplot(data, aes(time, y)) + 
  geom_line() +
  facet_time(time, n = 2)

And then get something like this:

one time series in multiple panels

(This plot was made using facet_wrap(~(year(as.Date(time)) > 2000), ncol = 1, scales = "free_x"), which messes up the x-axis scale, works only for 2 panels, and doesn't work well with geom_smooth().)

Ideally it would also handle summary statistics correctly, for example by using the correct data for geom_smooth(). Faceting wouldn't do that, because at the beginning of each facet the smoother would not use the data from the end of the previous chunk.

Is there a way to do this?

Thank you!

Elio Campitelli

2 Answers

3

You can do this by storing the plot object and then printing it twice, each time adding a coord_cartesian() call with different x limits:

orig_plot <- ggplot(data, aes(time, y)) + 
  geom_line() 

early <-  orig_plot + coord_cartesian(xlim = c(1982, 2000))
late  <-  orig_plot + coord_cartesian(xlim = c(2000, 2016))

coord_cartesian() only zooms the view without dropping observations, so both plots use all the data.
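As a rough check of that claim, the sketch below compares the layer data of a zoomed plot with that of a plot built from pre-filtered data. The `data` frame here is a made-up stand-in for the question's data (yearly values, 1982 to 2016), not the original series.

```r
library(ggplot2)

# Hypothetical stand-in for the question's data: one value per year
set.seed(1)
data <- data.frame(time = 1982:2016, y = cumsum(rnorm(35)))

orig_plot <- ggplot(data, aes(time, y)) + geom_line()

# Zooming with coord_cartesian() leaves the layer data untouched
zoomed   <- orig_plot + coord_cartesian(xlim = c(1982, 2000))
n_zoomed <- nrow(ggplot_build(zoomed)$data[[1]])

# Filtering before plotting drops every observation after 2000
filtered   <- ggplot(subset(data, time <= 2000), aes(time, y)) + geom_line()
n_filtered <- nrow(ggplot_build(filtered)$data[[1]])

c(zoomed = n_zoomed, filtered = n_filtered)
```

The zoomed plot still carries all 35 rows, so a layer such as geom_smooth() would be fitted on the whole series, while the filtered plot only sees the 19 rows up to 2000.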

To plot them on the same page, use grid (I got this from the ggplot2 book, which is probably around as a pdf somewhere):

library(grid)
vp1 <- viewport(width = 1, height = .5, just = c("center", "bottom"))
vp2 <- viewport(width = 1, height = .5, just = c("center", "top"))
print(early, vp = vp1)
print(late, vp = vp2)
  • This doesn't seem to work with Date class x variable. I get `Error: Invalid input: date_trans works with objects of class Date only`. – Elio Campitelli Mar 24 '17 at 18:49
  • You could use `grid.arrange` from the `gridExtra` package to lay out the plots: `grid.arrange(early, late, ncol=1)`. – eipi10 Mar 24 '17 at 18:54
  • Using `coord_cartesian(xlim = as.Date(c(1982, 2000)))` solves the issue and the [result](http://i.imgur.com/srneYUC.png) is workable (labels are duplicated, so I'll need to remove them manually). Another problem is that it doesn't work inside an Rmarkdown document. – Elio Campitelli Mar 24 '17 at 18:54
  • In Rmarkdown, the problem is much simpler: just print `early`, then print `late`. –  Mar 24 '17 at 18:56
  • The problem I'm having with this approach is that the scales are not equal, but I think with some tinkering I can make it work. – Elio Campitelli Mar 24 '17 at 19:14
  • @user3603486 This was a very nice solution, thanks ! – Basilique Mar 25 '20 at 13:35
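For the Date-class case raised in the comments above, one option that should work is to pass the limits as Date values built from date strings. This is only a sketch: the monthly `data` frame and the exact cut-off dates are assumptions made for illustration.

```r
library(ggplot2)

# Hypothetical monthly series with a Date-class x variable
set.seed(1)
data <- data.frame(
  time = seq(as.Date("1982-01-01"), as.Date("2016-12-01"), by = "month")
)
data$y <- cumsum(rnorm(nrow(data)))

orig_plot <- ggplot(data, aes(time, y)) + geom_line()

# Limits are Date objects, matching the class of the x variable
early <- orig_plot +
  coord_cartesian(xlim = as.Date(c("1982-01-01", "2000-01-01")))
late <- orig_plot +
  coord_cartesian(xlim = as.Date(c("2000-01-01", "2016-12-31")))
```

Both objects can then be printed one after the other, or laid out with grid or gridExtra as described in the answer.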
3

Below I create two separate plots, one for the period 1982-1999 and one for 1999-2016 and then lay them out using grid.arrange from the gridExtra package. The horizontal axes are scaled equivalently in both plots.

I also generate regression lines outside of ggplot using the loess function so that they can be added using geom_line (you can of course use any regression function here, such as lm, gam, splines, etc.). With this approach the regression can be run on the entire time series, ensuring continuity of the regression line across the two panels, even though we break the time series into two halves for plotting.

library(ggplot2)    # For the plots themselves
library(dplyr)      # For the chaining (%>%) operator
library(purrr)      # For the map function
library(gridExtra)  # For the grid.arrange function

Function to extract a legend from a ggplot. We'll use this to get one legend across two separate plots.

# http://stackoverflow.com/questions/12539348/ggplot-separate-legend-and-plot
g_legend<-function(a.gplot){
  tmp <- ggplot_gtable(ggplot_build(a.gplot))
  leg <- which(sapply(tmp$grobs, function(x) x$name) == "guide-box")
  legend <- tmp$grobs[[leg]]
  legend
}

# Fake data
set.seed(255)
dat = data.frame(time=rep(seq(1982,2016,length.out=500),2),
                 value= c(arima.sim(list(ar=c(0.4, 0.05, 0.5)), n=500), 
                          arima.sim(list(ar=c(0.3, -0.3, 0.6)), n=500)),
                 group=rep(c("A","B"), each=500))

Generate smoother lines using loess: We want a separate regression line for each level of group, so we use group_by with the chaining operator from dplyr:

dat = dat %>% group_by(group) %>%
        mutate(smooth = predict(loess(value ~ time, span=0.1)))
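The same fit-first, split-later idea extends to more than two panels. The base-R sketch below uses a single made-up series `d` and a hypothetical panel count `n_panels`; it fits loess once on the full series and only then cuts the rows into equal-width time chunks, so each chunk carries smoothed values that were computed from its neighbours' data as well.

```r
# Hypothetical single series, used only for illustration
set.seed(42)
d <- data.frame(time = seq(1982, 2016, length.out = 200))
d$value <- sin(d$time / 3) + rnorm(200, sd = 0.3)

# Fit the smoother once, on the whole series
d$smooth <- predict(loess(value ~ time, data = d, span = 0.2))

# Then split the rows into n_panels equal-width time chunks
n_panels <- 3
panels <- split(d, cut(d$time, breaks = n_panels))

# Rows per chunk
sapply(panels, nrow)
```

Each element of `panels` could then be passed to the plotting code in place of the filtered data frame.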

Create a list of two plots, one for each time period: We use map to create separate plots for each time period and return a list with the two plot objects as elements (you can also use base lapply for this instead of map):

pl = map(list(c(1982,1999), c(1999,2016)), 
         ~ ggplot(dat %>% filter(time >= .x[1], time <= .x[2]), 
                  aes(colour=group)) +
             geom_line(aes(time, value), alpha=0.5) +
             geom_line(aes(time, smooth), size=1) + 
             scale_x_continuous(breaks=1982:2016, expand=c(0.01,0)) +
             scale_y_continuous(limits=range(dat$value)) +
             theme_bw() +
             labs(x="", y="", colour="") +
             theme(strip.background=element_blank(),
                   strip.text=element_blank(),
                   axis.title=element_blank()))


# Extract legend as a separate graphics object
leg = g_legend(pl[[1]])

Finally, we lay out both plots (after removing legends) plus the extracted legend:

grid.arrange(arrangeGrob(grobs=map(pl, function(p) p + guides(colour="none")), ncol=1),
             leg, ncol=2, widths=c(10,1), left="Value", bottom="Year")


eipi10