In the following code, not all areas are shown in the plot called gg2, although every stacked value should lie within the limits passed to `ylim()`. I suspect this is due to numerical imprecision in combination with `ylim()`, but the result still seems odd to me. Is this a bug or desired behavior?

Thanks and best greetings, Sebastian

library('ggplot2')
library('magrittr')
library('tibble')
library('dplyr')
library('tidyr')

squeeze <- function(x, min_value = 0, max_value = 1) {
  pmin(max_value, pmax(min_value, x))
}

par_values <- 0.1 * 2^(-3:3)

result <- expand_grid(beta = par_values, 
                      gamma = par_values) %>%
  mutate(S = squeeze(gamma / beta),
         I = squeeze((beta - gamma) / beta))

result_long <- result %>% pivot_longer(cols = c('S', 'I'))

(gg1 <- result_long %>%
    ggplot(aes(x = beta, y = value, fill = name)) + geom_area() +
    facet_grid(rows = vars(gamma)))

(gg2 <- gg1 + ylim(0, 1))

(gg3 <- gg1 + ylim(0, 1.001))

Here is the output of sessionInfo():

R version 4.2.1 (2022-06-23)
Platform: aarch64-apple-darwin20 (64-bit)
Running under: macOS Monterey 12.6.2

Matrix products: default
LAPACK: /Library/Frameworks/R.framework/Versions/4.2-arm64/Resources/lib/libRlapack.dylib

locale:
[1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8

attached base packages:
[1] stats graphics grDevices utils datasets methods base

other attached packages:
[1] tidyr_1.2.1 dplyr_1.0.10 tibble_3.1.8 magrittr_2.0.3 ggplot2_3.4.0

loaded via a namespace (and not attached): [1] rstudioapi_0.14 tidyselect_1.2.0 munsell_0.5.0 colorspace_2.0-3 R6_2.5.1 rlang_1.0.6 fansi_1.0.3 tools_4.2.1 grid_4.2.1 gtable_0.3.1 utf8_1.2.2
[12] cli_3.6.0 DBI_1.1.3 withr_2.5.0 ellipsis_0.3.2 assertthat_0.2.1 lifecycle_1.0.3 farver_2.1.1 purrr_1.0.1 vctrs_0.5.1 glue_1.6.2 labeling_0.4.2
[23] compiler_4.2.1 pillar_1.8.1 generics_0.1.3 scales_1.2.1 pkgconfig_2.0.3

  • I just found something useful in the help page of `ylim`: "By default, any values outside the limits specified are replaced with NA." However, the behavior still seems odd to me, since `sums <- result_long %>% group_by(beta, gamma) %>% summarize(sum_SI = sum(value))` does not show any coordinates that are outside the specified limits. I think it is a numerical problem. Maybe ggplot2 should widen the region by a safety margin (e.g. 0.1 percent on each side) before cropping / not showing areas? – Sebastian Gerdes Jan 30 '23 at 14:03
  • I understand the problem. Without deep-diving the problem, I suspect it may be related to floating-point comparisons (see https://stackoverflow.com/q/9508518/3358272). If you change to `gg1 + ylim(0, 1 + 1e-10)`, it does not clip the graph. If what I suspect is true, it's not technically a bug in ggplot2 or R, mostly a concern with how floating-point comparison is made. (If I'm wrong, I hope somebody will offer an alternate theory :-) – r2evans Jan 30 '23 at 14:09
  • FYI, "safety margin" perhaps, but the safety margin referred to as `expand=` (in the `scale_*` functions) isn't responsible for the data-clipping itself. – r2evans Jan 30 '23 at 14:11
  • I think the `stat = "align"` default might contribute to some imprecision. If you use `geom_area(stat = "identity")`, it renders just fine for me. – teunbrand Jan 30 '23 at 14:11
  • It definitely seems like a floating point math thing, or perhaps the stacking factors in some extra y value? Maybe the second thing stacked on top does not have its "0" starting at the max value for the bottom piece, so it actually "starts" one pixel above that y value? In any case, the "fix" is also to use `coord_cartesian(ylim = c(0,1))`, since `coord_cartesian()` does not remove the values outside of the limits, it just "zooms in". `ylim()` zooms in too, but discards data outside of the limits. – chemdork123 Jan 30 '23 at 16:36
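
To make the floating-point suspicion in the comments above concrete, here is a minimal sketch in R. It does not show where the excess arises in the plot from the question; it only illustrates that a value a hair above the upper limit counts as "outside" and is replaced with NA. It assumes the scales package is installed (it is a dependency of ggplot2), and `y_top` is a made-up vector of stacked heights.

```r
# Floating-point arithmetic: results that are mathematically equal
# can differ in the last bits.
x <- 0.1 + 0.2
x == 0.3                    # FALSE
x > 0.3                     # TRUE: x is very slightly above 0.3
isTRUE(all.equal(x, 0.3))   # TRUE: equal within a tolerance

# Continuous scales censor out-of-bounds values by default, i.e. replace
# them with NA. scales::censor() performs that replacement.
# y_top is hypothetical: the last entry exceeds 1 by a tiny amount.
y_top <- c(0.5, 1, 1 + 1e-12)
censored <- scales::censor(y_top, range = c(0, 1))
censored                    # 0.5 1.0 NA
```

A value exactly equal to 1 is kept, while `1 + 1e-12` becomes NA, which is consistent with the observation that `ylim(0, 1 + 1e-10)` does not clip the graph.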

1 Answer

Thanks for your input!

My take-home messages:

  • I agree that the issue is probably due to floating-point comparisons.
  • `coord_cartesian(ylim = c(0,1))` definitely solves the problem!
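
For reference, a minimal sketch of the two workarounds mentioned in the comments, using a small stand-in data set rather than the data from the question. The data frame `df` and the object names `p`, `p_zoom` and `p_identity` are made up for illustration.

```r
library(ggplot2)

# Hypothetical stand-in data: two stacked series, "S" and "I".
df <- data.frame(
  x     = rep(1:3, times = 2),
  value = c(0.2, 0.5, 0.8, 0.8, 0.5, 0.2),
  name  = rep(c("S", "I"), each = 3)
)

p <- ggplot(df, aes(x = x, y = value, fill = name)) + geom_area()

# Workaround 1: zoom with coord_cartesian(), which keeps all data
# and only changes the visible region.
p_zoom <- p + coord_cartesian(ylim = c(0, 1))

# Workaround 2 (from teunbrand's comment): use stat = "identity"
# instead of the default stat and keep ylim().
p_identity <- ggplot(df, aes(x = x, y = value, fill = name)) +
  geom_area(stat = "identity") +
  ylim(0, 1)
```

The first variant is the one I ended up using, because it does not depend on whether the stacked values land exactly on the limit.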

Best greetings, Sebastian