In the following example, I expect the stacked bars to reach heights of 500 and 45. Why do both extend past 1,000?

library(tidyverse)

dat <- tibble(
  x = c(1, 1, 2, 2, 2),
  grp = factor(c(0, 1, 0, 1, 2), levels = 0:2),
  y = c(200, 300, 25, 15, 5)
)

ggplot(dat, aes(x = x, y = y, fill = grp)) + 
  geom_bar(stat = "identity") + 
  scale_y_log10(labels = scales::comma)

[Figure: stacked bar chart on a log10 y-axis; both bars extend past 1,000]

If you use position = "dodge", the y-axis values appear to be correct.

ggplot(dat, aes(x = x, y = y, fill = grp)) + 
  geom_bar(stat = "identity", position = "dodge") + 
  scale_y_log10(labels = scales::comma)

[Figure: dodged bar chart on a log10 y-axis; bar heights match the y values]

Any ideas where ggplot2 is getting its y-axis values in the first plot?

Jake Fisher
  • It looks like [this GitHub issue](https://github.com/tidyverse/ggplot2/issues/3671) is about something very similar. – aosmith Dec 19 '19 at 22:35
  • You expect the bar to give you the sum of the counts. But with the log transformation, you're actually getting the product of the counts. I find the accepted answer to this post -- https://stackoverflow.com/questions/40580937/ggplots-scale-y-log10-behavior -- to be very useful here. – AHart Dec 20 '19 at 00:25
  • And some suggested alternatives to `geom_bar` here: https://stackoverflow.com/questions/9502003/ggplot-scale-y-log10-issue/9507037#9507037. – AHart Dec 20 '19 at 00:26
  • Thank you! This is extremely helpful. It looks like the key issue is that logging the axis means that the top bar of the stack will look much shorter than the bottom bar, even if they represent the same value. ggplot gives the product, which makes the area look roughly correct, but makes the y axis very wrong. You can get around that with `coord_trans(y = "log1p")`. – Jake Fisher Dec 23 '19 at 13:43
  • I'll vote to reopen this, and if @aosmith or @AHart want to add an answer that mentions the elements in the comments -- namely: (1) the code in the question is giving you the product, (2) `coord_trans()` gives a workaround, and (3) `coord_trans()` produces a weird scaling of the bars, which is probably why `scale_y_log10()` doesn't work in the first place -- I'll accept it as an answer. – Jake Fisher Dec 23 '19 at 13:49
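
The points raised in the comments can be checked with a little arithmetic. The sketch below is an illustration, not an accepted answer. It assumes, as the comments suggest, that `scale_y_log10()` transforms `y` before the bars are stacked, so the stack adds up log10 values and the top of each bar lands on the product of its segments. The names `expected_tops` and `stacked_tops` are made up for this example. The plotting part runs only if ggplot2 is installed, and uses the `coord_trans(y = "log1p")` workaround mentioned above.

```r
# Data from the question, built in base R so the arithmetic needs no packages.
dat <- data.frame(
  x   = c(1, 1, 2, 2, 2),
  grp = factor(c(0, 1, 0, 1, 2), levels = 0:2),
  y   = c(200, 300, 25, 15, 5)
)

# What the question expects: segments summed on the original scale.
expected_tops <- tapply(dat$y, dat$x, sum)

# What stacking after a log10 transform amounts to: the log10 values are
# summed, so back-transforming the top of each stack gives the product.
stacked_tops <- 10^tapply(log10(dat$y), dat$x, sum)

print(expected_tops)  # 500 and 45
print(stacked_tops)   # 60000 and 1875 (up to floating-point error)

# Workaround from the comments: stack on the original scale and transform
# only the coordinate system. log1p is used because the bars start at 0.
if (requireNamespace("ggplot2", quietly = TRUE)) {
  library(ggplot2)
  p <- ggplot(dat, aes(x = x, y = y, fill = grp)) +
    geom_bar(stat = "identity") +
    coord_trans(y = "log1p")
  print(p)
}
```

With this approach the axis should read 500 and 45 at the bar tops, but the segments are drawn on a nonlinear axis, so equal values higher in the stack look much shorter than those at the bottom. That is the odd scaling noted in the comments.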

0 Answers