Transforming ggplot2 axes to log10 using scales::trans_breaks() can sometimes (if the range is small enough) produce un-pretty breaks, at non-integer powers of ten.

Is there a general purpose way of setting these breaks to occur only at 10^x, where x are all integers, and, ideally, consecutive (e.g. 10^1, 10^2, 10^3)?

Here's an example of what I mean.

library(ggplot2)

# dummy data
df <- data.frame(fct = rep(c("A", "B", "C"), each = 3),
                 x = rep(1:3, 3),
                 y = 10^seq(from = -4, to = 1, length.out = 9))

p <- ggplot(df, aes(x, y)) +
  geom_point() +
  facet_wrap(~ fct, scales = "free_y") # faceted to try and emphasise that it's general purpose, rather than specific to a particular axis range

The unwanted result -- y-axis breaks are at non-integer powers of ten (e.g. 10^2.8)

p + scale_y_log10(
    breaks = scales::trans_breaks("log10", function(x) 10^x),
    labels = scales::trans_format("log10", scales::math_format(10^.x))
  )

I can achieve the desired result for this particular example by adjusting the n argument to scales::trans_breaks(), as below. But this is not a general purpose solution, of the kind that could be applied without needing to adjust anything on a case-by-case basis.

p + scale_y_log10(
    breaks = scales::trans_breaks("log10", function(x) 10^x, n = 1),
    labels = scales::trans_format("log10", scales::math_format(10^.x))
  )

Should add that I'm not wed to using scales::trans_breaks(), it's just that I've found it's the function that gets me closest to what I'm after.

Any help would be much appreciated, thank you!

pyg
  • maybe helpful - https://stackoverflow.com/q/5380417/7941188 – tjebo Jan 19 '21 at 12:58
  • maybe also helpful - https://stackoverflow.com/questions/15622001/how-to-display-only-integer-values-on-an-axis-using-ggplot2 – tjebo Jan 19 '21 at 13:00

1 Answer

Here is an approach that at the core has the following function.

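# x holds the axis limits on the data scale
breaks = function(x) {
    # candidate breaks on the log10 scale, with only 1 and 5 as 'nice' numbers
    brks <- scales::extended_breaks(Q = c(1, 5))(log10(x))
    # keep whole-number exponents only, then map back to the data scale
    10^(brks[brks %% 1 == 0])
}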

It gives extended_breaks() a narrow set of 'nice numbers' and then filters out non-integers.

This gives us the following for your example case:

library(ggplot2)
library(scales)
#> Warning: package 'scales' was built under R version 4.0.3

# dummy data
df <- data.frame(fct = rep(c("A", "B", "C"), each = 3),
                 x = rep(1:3, 3),
                 y = 10^seq(from = -4, to = 1, length.out = 9))

ggplot(df, aes(x, y)) +
  geom_point() +
  facet_wrap(~ fct, scales = "free_y") +
  scale_y_continuous(
    trans = "log10",
    breaks = function(x) {
      brks <- extended_breaks(Q = c(1, 5))(log10(x))
      10^(brks[brks %% 1 == 0])
    },
    labels = math_format(format = log10)
  )

Created on 2021-01-19 by the reprex package (v0.3.0)

I haven't tested this on many other ranges that might be difficult, but it should generalise better than setting the number of desired breaks to 1. Difficult ranges might be those that lie strictly between two consecutive powers of 10 without including either, for example 11-99 or 101-999.
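For ranges like those, one possible fallback is sketched below. It is a different technique from the `extended_breaks()` filter above: it simply brackets the limits with `floor()` and `ceiling()` on the log10 scale and returns every integer power of ten in between. The helper name `log10_bracket_breaks` is made up for illustration, and I'm assuming it is given positive limits, as a log10 scale would pass them.

```r
# Hypothetical helper (not part of the answer above): return every integer
# power of ten from just below the lower limit to just above the upper limit.
log10_bracket_breaks <- function(x) {
  # x holds the axis limits on the data scale; ignore NA and non-positive values
  x <- x[is.finite(x) & x > 0]
  if (length(x) == 0) return(numeric(0))
  lo <- floor(log10(min(x)))    # largest integer exponent at or below the range
  hi <- ceiling(log10(max(x)))  # smallest integer exponent at or above the range
  10^seq(lo, hi)
}

log10_bracket_breaks(c(11, 99))    # candidates: 10 and 100
log10_bracket_breaks(c(101, 999))  # candidates: 100 and 1000
```

It can be passed as `scale_y_log10(breaks = log10_bracket_breaks)`. Keep in mind that ggplot2 only draws breaks that fall inside the panel's range, so for a range such as 11-99 the bracketing breaks only show up if the scale expansion (or manually widened limits) reaches them.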

teunbrand
  • Instead of `scale_y_continuous(trans = "log10", ...)` you can also use `scale_y_log10()`. +1 – tjebo Jan 19 '21 at 12:56
  • That's right, I'm just in the habit of being verbose in this way (so that I can just change a parameter instead of a function when exploring data). – teunbrand Jan 19 '21 at 13:35
  • Wow, thank you very much @teunbrand, that's excellent. Would you mind please explaining what effect the `Q` argument has on `extended_breaks()` in this context? I've tried playing around with it, but it's not clear to me what it does. Just trying to understand your solution better :) – pyg Jan 20 '21 at 23:20
  • The `Q` argument gets passed on to `labeling::extended` and it is a set of numbers that the break algorithm considers 'nice', i.e. it chooses those types of numbers. By default, this is 1, 2, 2.5, 3, 4 and 5, I think. Because it won't accept a length 1 `Q`, I had to pass it a second number too. Later I filter for multiples of 1 (`%% 1 == 0`), so it doesn't matter really what the second number is. – teunbrand Jan 20 '21 at 23:37
  • Nice solution, @teunbrand. I'm trying to use it in a function that uses egg::ggarrange at the end, so I'd like to use your solution several times as a function within a function, but I'm having trouble; I can't seem to pass the necessary info into it when I call it in the scale_y_continuous function. Any suggestions? – jsgraydon May 14 '22 at 00:56
  • @teunbrand, I've just found out that you can pass the same "nice number" twice to `labeling::extended()` to force it to use your preferred one. It won't complain. So, in this example, you could call `extended_breaks(Q = c(1, 1))` and use the result right away. – leogama Mar 11 '23 at 16:31
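On the comments about reusing the break function inside another function and about experimenting with `Q`: one way to do both is to wrap the answer's logic in a small factory that returns a breaks function. This is only a sketch. The names `integer_log10_breaks` and `plot_log_y` are hypothetical, and the plotting wrapper assumes a data frame with `x` and `y` columns like the example data. I have not verified that any particular `Q` makes the integer filter unnecessary, so the sketch keeps the filter whatever `Q` is.

```r
library(scales)

# Hypothetical factory: returns a breaks function for a log10 scale.
# Q is forwarded by extended_breaks() to labeling::extended().
integer_log10_breaks <- function(Q = c(1, 5), n = 5) {
  gen <- extended_breaks(n = n, Q = Q)
  function(x) {
    # x holds the axis limits on the data scale
    brks <- gen(log10(x))
    # keep whole-number exponents, allowing for floating-point noise
    10^(brks[abs(brks - round(brks)) < 1e-8])
  }
}

# Hypothetical wrapper showing the factory used inside another function
# (assumes `data` has columns x and y, and that ggplot2 is installed).
plot_log_y <- function(data, Q = c(1, 5)) {
  ggplot2::ggplot(data, ggplot2::aes(x, y)) +
    ggplot2::geom_point() +
    ggplot2::scale_y_log10(
      breaks = integer_log10_breaks(Q = Q),
      labels = math_format(format = log10)
    )
}
```

Each call such as `plot_log_y(df)` builds its own breaks function, so the resulting plots can be collected in a list and arranged afterwards. Calling `integer_log10_breaks(Q = c(1, 2))(c(1e-4, 10))` directly, without plotting, is a quick way to see what a given `Q` produces.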