
Inspired by the question "Finding the elbow/knee in a curve", I started to play around with smooth.spline().

In particular, I want to visualize how the parameter df (degrees of freedom) influences the approximation and its first and second derivatives. Note that this question is not about approximation but about a specific problem (or edge case) in visualisation with ggplot2.

First attempt: simple facet_grid()

library(ggplot2)
ggplot(ap, aes(x, y)) +
  geom_point(data = dp, alpha = 0.2) +
  geom_line() + 
  facet_grid(deriv ~ df, scales = "free_y", labeller = label_both) + 
  theme_bw()


dp is a data.table containing the data points for which an approximation is sought and ap is a data.table with the approximated data plus the derivatives (data are given below).

For each row, facet_grid() with scales = "free_y" has chosen a scale which displays all data. Unfortunately, one panel has what amount to "outliers", which make it difficult to see details in the other panels. So, I want to "zoom in".

"Zoom in" using coord_cartesian()

ggplot(ap, aes(x, y)) +
  geom_point(data = dp, alpha = 0.2) +
  geom_line() + 
  facet_grid(deriv ~ df, scales = "free_y", labeller = label_both) + 
  theme_bw() +
  coord_cartesian(ylim = c(-200, 50))


With the manually selected range, more details in the panels of row 3 have become visible. However, the limit has been applied to all panels of the grid, so in row 1 details can hardly be distinguished.

What I'm looking for is a way to apply coord_cartesian() with specific parameters separately to each individual panel (or group of panels, e.g., rowwise) of the grid. For instance, is it possible to manipulate the ggplot object afterwards?

Workaround: Combine individual plots with cowplot

As a workaround, we can create three separate plots and combine them afterwards using the cowplot package:

g0 <- ggplot(ap[deriv == 0], aes(x, y)) +
  geom_point(data = dp, alpha = 0.2) +
  geom_line() + 
  facet_grid(deriv ~ df, scales = "free_y", labeller = label_both) + 
  theme_bw()

g1 <- ggplot(ap[deriv == 1], aes(x, y)) +
  geom_line() + 
  facet_grid(deriv ~ df, scales = "free_y", labeller = label_both) + 
  theme_bw() +
  coord_cartesian(ylim = c(-50, 50))

g2 <- ggplot(ap[deriv == 2], aes(x, y)) +
  geom_line() + 
  facet_grid(deriv ~ df, scales = "free_y", labeller = label_both) + 
  theme_bw() +
  coord_cartesian(ylim = c(-200, 100))

cowplot::plot_grid(g0, g1, g2, ncol = 1, align = "v")


Unfortunately, this solution

  • requires writing code to create three separate plots,
  • duplicates strips and axes and adds whitespace, which reduces the space available for displaying the data.
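
The first drawback (three near-identical plot definitions) could perhaps be reduced by wrapping the shared layers in a function. The following is only a sketch under assumptions: `make_row_plot()` and `ap_toy` are hypothetical names, `ap_toy` is a toy stand-in for `ap`, and the `geom_point()` layer of the first row is left out for brevity. It does not address the duplicated strips and axes.

```r
library(ggplot2)

# Hypothetical helper: builds one facet row, optionally zoomed to `ylim`.
# ylim = NULL keeps the default (unzoomed) y range.
make_row_plot <- function(data, ylim = NULL) {
  ggplot(data, aes(x, y)) +
    geom_line() +
    facet_grid(deriv ~ df, scales = "free_y", labeller = label_both) +
    theme_bw() +
    coord_cartesian(ylim = ylim)
}

# Toy stand-in for `ap` (assumption: same columns x, y, df, deriv)
ap_toy <- expand.grid(x = 1:5, df = c(2, 12), deriv = 0:2)
ap_toy$y <- with(ap_toy, x * (deriv + 1))

# One y range per facet row; NULL means "do not zoom"
ylims <- list(NULL, c(-50, 50), c(-200, 100))

plots <- lapply(0:2, function(d) {
  make_row_plot(ap_toy[ap_toy$deriv == d, ], ylims[[d + 1]])
})

# The list can then be combined as before:
# cowplot::plot_grid(plotlist = plots, ncol = 1, align = "v")
```

With the real data, `ap_toy` would be replaced by `ap`, and the data points could be added to the first element with `plots[[1]] + geom_point(data = dp, alpha = 0.2)`.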

Is facet_wrap() an alternative?

We can use facet_wrap() instead of facet_grid():

ggplot(ap, aes(x, y)) +
  # geom_point(data = dp, alpha = 0.2) + # this line causes error message
  geom_line() + 
  facet_wrap(~ deriv + df, scales = "free_y", labeller = label_both, nrow = 3) + 
  theme_bw()

Now, the y-axis of every panel is scaled individually, which reveals details in some of the panels. Unfortunately, we still can't "zoom in" on the bottom right panel because using coord_cartesian() would affect all panels.

In addition, the line

geom_point(data = dp, alpha = 0.2)

strangely causes

Error in gList(list(x = 0.5, y = 0.5, width = 1, height = 1, just = "centre", : only 'grobs' allowed in "gList"

I had to comment this line out, so the data points which are to be approximated are not displayed.
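
A comment on this question attributes the error to dp containing only one of the two faceting variables (deriv, but not df). Assuming that diagnosis is right, one possible workaround is to replicate the data points once per df value so that the layer data carries both variables. The sketch below is written in base R with toy stand-ins (`dp_toy`, `df_values` and `dp_all` are hypothetical names); it has not been checked against the ggplot2 version that raised the error.

```r
# Toy stand-in for `dp` (assumption: same columns x, y, deriv)
dp_toy <- data.frame(x = c(3, 2, 1), y = 1:3, deriv = 0)

# Stand-in for the df values used in `ap`, e.g. unique(ap$df)
df_values <- c(2, 12)

# One copy of the data points per df value, each tagged with its df
dp_all <- do.call(
  rbind,
  lapply(df_values, function(d) cbind(dp_toy, df = d))
)

# dp_all now has both faceting variables and could be passed to
# geom_point(data = dp_all, alpha = 0.2) in the facet_wrap() plot
```

With data.table objects, `rbindlist()` could take the place of `do.call(rbind, ...)`.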

Data

library(data.table)
# data points
dp <- data.table(
  x = c(6.6260, 6.6234, 6.6206, 6.6008, 6.5568, 6.4953, 6.4441, 6.2186,
        6.0942, 5.8833, 5.7020, 5.4361, 5.0501, 4.7440, 4.1598, 3.9318,
        3.4479, 3.3462, 3.1080, 2.8468, 2.3365, 2.1574, 1.8990, 1.5644,
        1.3072, 1.1579, 0.95783, 0.82376, 0.67734, 0.34578, 0.27116, 0.058285),
  y = 1:32,
  deriv = 0)
# approximated data points and derivatives
ap <- rbindlist(
  lapply(seq(2, length(dp$x), length.out = 4),
         function(df) {
           rbindlist(
             lapply(0:2, 
                    function(deriv) {
                      result <- as.data.table(
                        predict(smooth.spline(dp$x, dp$y, df = df), deriv = deriv))
                      result[, c("df", "deriv") := list(df, deriv)]
                    })
           )
         })
)  
Uwe
  • You can remove the strip labels and axis labels for the rows/columns where they are not needed and then use the `align="hv"` argument to `plot_grid` to ensure that all the panels are the same size in the final plot. [Here's an answer](http://stackoverflow.com/a/35304121/496488) I wrote a while back that solves a different problem than yours, but that uses a similar idea for separately customizing various groups of plots that go into the final layout. – eipi10 Jan 08 '17 at 19:09
  • a couple of suggestions here: http://stackoverflow.com/questions/12207419/r-how-do-i-use-coord-cartesian-on-facet-grid-with-free-ranging-axis – user20650 Jan 08 '17 at 19:15
  • @eipi10 Thank you very much. I've tried it, but it adds a lot more code (calls to `theme`) to the already voluminous code of the `cowplot` workaround. BTW: `align = "v"` works well in terms of removing white space, while `align = "hv"` maintains the white space which I thought should be eliminated by replacing strip and axis labels by `element_blank()` except for the topmost strip and the bottommost axis. – Uwe Jan 09 '17 at 07:41
  • @user20650 Thank you very much. I've tried to reproduce your answer in the link, but apparently the numbering of grobs has changed with the new versions of `ggplot2`. In addition, it would be a variant of the `cowplot` workaround, which requires producing three different plots and combining them in an (admittedly very) clever way. – Uwe Jan 09 '17 at 07:48
  • Yes, `cowplot` likes large margins between plots. In the answer I linked to, I changed the plot margins in order to reduce the amount of space between panels. – eipi10 Jan 09 '17 at 07:48
  • I was also [recently tripped up by the change in ggplot2's grob structure](http://stackoverflow.com/questions/40732543/seeking-workaround-for-gtable-add-grob-code-broken-by-ggplot-2-2-0). Maybe the discussion in my question and @SandyMuspratt's answer will be helpful. – eipi10 Jan 09 '17 at 07:49
  • @eipi10 I've added `theme(plot.margin=unit(c(0,-0.15,0,-0.15), "lines"))` but the white space is still there. I believe this is because I kept the topmost strip and the bottommost axis label. When I remove all _decorations_ then `"hv"` removes all the white space but now I would have to add the strip labels and the x axis in some way... – Uwe Jan 09 '17 at 07:58
  • @eipi10 Looks as if it would become a deep dive into the inner workings of `ggplot`... BTW: I've added a variant using `facet_wrap()` which ran into a strange `gList` error. – Uwe Jan 09 '17 at 08:23
  • Uwe, please see the second message in http://chat.stackoverflow.com/rooms/132675/uwe-facet for a quick way. I can't remember if the panel names used to describe the row and column position in the facet layout, but it appears that they don't now, so use `t` in the layout to find the relevant rows. – user20650 Jan 09 '17 at 09:58
  • Uwe, were you able to resolve your problem using @user20650's code in chat? – eipi10 Jan 11 '17 at 04:35
  • Also, FYI, the error you're getting with `geom_point(data = dp, alpha = 0.2)` has something to do with `dp` having only one of the two faceting variables. I was able to reproduce the same error with `mtcars1 = mtcars[,c("wt","mpg","vs")]; ggplot(mtcars, aes(wt, mpg)) + geom_point(data=mtcars1) + geom_line() + facet_wrap(~ vs + am)`. – eipi10 Jan 11 '17 at 04:44
  • @eipi10 Following @user20650's suggestions in chat I'm thinking about writing a `zoom_facet_grid` function (if time permits). Alternatives to dealing with grobs could be to tweak the object returned by `ggplot_build` or to go along [Extending existing facet function](http://ggplot2.tidyverse.org/articles/extending-ggplot2.html#extending-existing-facet-function) – Uwe Jan 11 '17 at 06:57

1 Answer


Late answer, but the following hack just occurred to me. Would it work for your use case?

Step 1. Create an alternative version of the intended plot, limiting the range of y values such that scales = "free_y" gives a desired scale range for each facet row. Also create the intended facet plot with the full data range:

library(ggplot2)
library(dplyr)

# alternate plot version with truncated data range
p.alt <- ap %>%
  group_by(deriv) %>%
  mutate(upper = quantile(y, 0.75),
         lower = quantile(y, 0.25),
         IQR.multiplier = (upper - lower) * 10) %>%
  ungroup() %>%
  mutate(is.outlier = y < lower - IQR.multiplier | y > upper + IQR.multiplier) %>%
  mutate(y = ifelse(is.outlier, NA, y)) %>%

  ggplot(aes(x, y)) +
  geom_point(data = dp, alpha = 0.2) +
  geom_line() + 
  facet_grid(deriv ~ df, scales = "free_y", labeller = label_both) + 
  theme_bw()

# intended plot version with full data range
p <- p.alt %+% ap

Step 2. Use ggplot_build() to generate plot data for both ggplot objects. Apply the panel parameters of the alt version onto the intended version:

p <- ggplot_build(p)
p.alt <- ggplot_build(p.alt)

p$layout$panel_params <- p.alt$layout$panel_params
rm(p.alt)

Step 3. Build the intended plot from the modified plot data, & plot the result:

p <- ggplot_gtable(p)

grid::grid.draw(p)


Note: in this example, I truncated the data range by setting to NA all values that lie more than 10*IQR beyond the upper / lower quartile of their facet row. This can be replaced by any other logic for defining outliers.
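
To make the truncation rule in the note concrete, here is a small base R sketch of the same idea on toy data. It is an illustration under assumptions rather than the code used above: `is_far_out()` and `toy` are hypothetical names, and the multiplier `k` is exposed as an argument so a different threshold can be tried.

```r
# Hypothetical helper: TRUE for values lying more than k * IQR
# below the lower quartile or above the upper quartile
is_far_out <- function(y, k = 10) {
  q <- quantile(y, c(0.25, 0.75), na.rm = TRUE)
  margin <- (q[[2]] - q[[1]]) * k
  y < q[[1]] - margin | y > q[[2]] + margin
}

# Toy data: two facet rows, the second containing one extreme value
toy <- data.frame(
  deriv = rep(c(0, 1), each = 6),
  y     = c(1, 2, 3, 4, 5, 6, -1, 0, 1, 2, 3, 1000)
)

# Apply the rule separately within each facet row;
# ave() returns 0/1 here because toy$y is numeric
toy$is.outlier <- ave(toy$y, toy$deriv, FUN = is_far_out) == 1

# Replace flagged values by NA so that free scales ignore them
toy$y_trunc <- ifelse(toy$is.outlier, NA, toy$y)
```

Only the value 1000 is flagged in this toy data; any other function that returns a logical vector of the same length could be swapped in for `is_far_out()`.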

Z.Lin