4

I'm having trouble formatting the y-axis labels. When I use scale_y_log10() in my plot, it labels the axis 0.1, 10, 1000. I really need the labels displayed as 1e-1, 1e1, 1e3. The math_format help page doesn't help, because it doesn't show the format I need.

Anything I can answer I will.

Phil Colson
  • Does the [first example here](http://docs.ggplot2.org/0.9.3/annotation_logticks.html) help you? – Henrik Sep 03 '13 at 19:51
  • I was looking for "1e3" type of label instead. Thank you though. – Phil Colson Sep 03 '13 at 19:55
  • related - just the other way round: https://stackoverflow.com/questions/14563989/force-r-to-stop-plotting-abbreviated-axis-labels-e-g-1e00-in-ggplot2/14564026#14564026 – tjebo Jul 05 '22 at 05:56

3 Answers

10

The problem is that R uses a poorly understood penalty mechanism to decide whether to print a number in fixed or scientific notation. This is controlled by options(scipen).

The value is a penalty that R adds to the width of the scientific representation before comparing it with the width of the fixed-point representation; fixed notation is used unless it is wider than the penalised scientific form. For example, with options(scipen = 3), R takes the 5 characters of 1e+02, adds the penalty of 3 to get 8, and compares that with the 3 characters of 100, so 100 gets printed. To fix your example, just set options(scipen = -10) so that scientific notation is always favoured over fixed point. So (using @PeterB's example) you can use scipen, which should let you avoid setting breaks manually. Note that R writes the exponent with at least two digits, so the labels read 1e-01, 1e+01, 1e+03:

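library(ggplot2)
library(ggplot2movies)  # the movies data set moved here in ggplot2 2.0.0

options(scipen = -10)  # global option: affects all number printing in this session
ggplot(data=subset(movies, votes > 1000)) +
  aes(x = rating, y = votes / 10000) +
  geom_point()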
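
As a quick check of the penalty rule described above, the following sketch compares how base R formats the same numbers under different scipen values. It uses only format() and options(); the variable names are illustrative, and the previous option value is restored at the end so the session is left unchanged.

```r
# Save the current setting so it can be restored afterwards
old <- options(scipen = 0)

default_100 <- format(100)    # fixed "100" (3 chars) is not wider than "1e+02" (5 chars)
default_1e5 <- format(1e5)    # fixed "100000" (6 chars) is wider than "1e+05" (5 chars)

options(scipen = 3)
penalised_1e5 <- format(1e5)  # 6 <= 5 + 3, so fixed notation is used

options(scipen = -10)
forced_100 <- format(100)     # 3 > 5 - 10, so scientific notation is used

options(old)                  # undo the global change

print(c(default_100, default_1e5, penalised_1e5, forced_100))
```

Because scipen is a global option, saving and restoring it like this keeps the change from leaking into later printing in the same session.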


Miztli
Simon O'Hanlon
4

You can use the breaks and labels parameters of scale_y_log10 as in

library(ggplot2)
library(ggplot2movies)  # the movies data set moved here in ggplot2 2.0.0

ggplot(data=subset(movies, votes > 1000)) +
  aes(x = rating, y = votes / 10000) +
  scale_y_log10(breaks = c(0.1, 1, 10), labels = expression(10^-1, 10^0, 10^1)) +
  geom_point()

This might not be an elegant solution, but it works if you only have a limited number of plots.
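
If typing the breaks and labels by hand becomes tedious, both vectors can be derived from a range of exponents. This is a minimal sketch in base R; exponents, log_breaks and log_labels are illustrative names, and the resulting vectors would be passed as breaks = log_breaks, labels = log_labels in the scale_y_log10() call above.

```r
# Exponents to show on the axis (chosen by hand for this illustration)
exponents <- c(-1, 1, 3)

# Numeric break positions: 0.1, 10, 1000
log_breaks <- 10^exponents

# Plain-text labels in the "1e<exponent>" style asked for in the question
log_labels <- paste0("1e", exponents)

print(log_labels)
```

Plain character labels like these are drawn as written, whereas the expression() labels above are rendered as superscripted powers of ten.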

PeterB
4

The easiest way to achieve what you ask, with automatic limits and breaks and without side effects, is this:

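library(ggplot2)
library(ggplot2movies)  # the movies data set moved here in ggplot2 2.0.0
library(scales)

ggplot(data=subset(movies, votes > 1000)) +
  aes(x = rating, y = votes / 10000) +
  scale_y_log10(breaks = trans_breaks("log10", function(x) 10^x, n=3),
                # trans_format() takes log10 of each break and hands the exponent
                # to the formatter, which builds the "1e<exponent>" label from it
                labels = trans_format("log10", function(x) paste0("1e", x))) +
  geom_point()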

I prefer to use superscripts for the powers of ten, hide the minor grid, and add tick marks spaced on a log scale. This is also easy to achieve:

ggplot(data=subset(movies, votes > 1000)) +
  aes(x = rating, y = votes / 10000) +
  scale_y_log10(breaks = trans_breaks("log10", function(x) 10^x, n=3),
                labels = trans_format("log10", math_format(10^.x))) +
  theme(panel.grid.minor = element_blank()) +
  annotation_logticks(sides="l") + 
  geom_point()

The code above is adapted from the examples on the annotation_logticks help page. There is a lot of flexibility for adjusting the exact format.

Pedro J. Aphalo
  • I'll upvote this (I want to because I didn't know about this before) if you go to the effort of giving an example of this in action (you can copy paste from the link you gave if you really want!). Otherwise this is really just a comment and should be left as such. – Simon O'Hanlon Sep 04 '13 at 08:28
  • The link to annotation_logticks is dead - probably now https://ggplot2.tidyverse.org/reference/annotation_logticks.html – tjebo Jul 05 '22 at 05:49