Transform y axis in bar plot using scale_y_log10()

Question

Using the data.frame below, I want a bar plot with a log-transformed y axis.

I got this plot

using this code

ggplot(df, aes(x=id, y=ymean , fill=var, group=var)) +
  geom_bar(position="dodge", stat="identity",
           width = 0.7,
           size=.9)+
  geom_errorbar(aes(ymin=ymin,ymax=ymax),
                size=.25,   
                width=.07,
                position=position_dodge(.7))+
  theme_bw()

To log-transform the y axis so that the "low" level in B and D, which is close to zero, becomes visible, I added

+scale_y_log10()

which resulted in

Any suggestions on how to transform the y axis of the first plot?

By the way, some values in my data are close to zero, but none of them is zero.

UPDATE

Trying the answer by @computermacgyver suggested in the comments:

library(scales)  # provides trans_breaks(), trans_format(), math_format()
ggplot(df, aes(x=id, y=ymean , fill=var, group=var)) +
  geom_bar(position="dodge", stat="identity",
           width = 0.7,
           size=.9)+
  scale_y_log10("y",
                breaks = trans_breaks("log10", function(x) 10^x),
                labels = trans_format("log10", math_format(10^.x)))+
  geom_errorbar(aes(ymin=ymin,ymax=ymax),
                size=.25,   
                width=.07,
                position=position_dodge(.7))+
  theme_bw()

I got

DATA

dput(df)
structure(list(id = structure(c(7L, 7L, 7L, 1L, 1L, 1L, 2L, 2L, 
2L, 6L, 6L, 6L, 5L, 5L, 5L, 3L, 3L, 3L, 4L, 4L, 4L), .Label = c("A", 
"B", "C", "D", "E", "F", "G"), class = "factor"), var = structure(c(1L, 
2L, 3L, 1L, 2L, 3L, 1L, 2L, 3L, 1L, 2L, 3L, 1L, 2L, 3L, 1L, 2L, 
3L, 1L, 2L, 3L), .Label = c("high", "medium", "low"), class = "factor"), 
    ymin = c(0.189863418, 0.19131948, 0.117720496, 0.255852069, 
    0.139624146, 0.048182771, 0.056593774, 0.037262727, 0.001156667, 
    0.024461299, 0.026203592, 0.031913077, 0.040168571, 0.035235902, 
    0.019156667, 0.04172913, 0.03591233, 0.026405094, 0.019256055, 
    0.011310755, 0.000412414), ymax = c(0.268973856, 0.219709677, 
    0.158936508, 0.343307692, 0.205225352, 0.068857143, 0.06059596, 
    0.047296296, 0.002559633, 0.032446541, 0.029476821, 0.0394, 
    0.048959184, 0.046833333, 0.047666667, 0.044269231, 0.051, 
    0.029181818, 0.03052381, 0.026892857, 0.001511628), ymean = c(0.231733739333333, 
    0.204891473333333, 0.140787890333333, 0.295301559666667, 
    0.173604191666667, 0.057967681, 0.058076578, 0.043017856, 
    0.00141152033333333, 0.0274970166666667, 0.0273799226666667, 
    0.0357511486666667, 0.0442377366666667, 0.0409452846666667, 
    0.0298284603333333, 0.042549019, 0.0407020586666667, 0.0272998796666667, 
    0.023900407, 0.016336106, 0.000488014)), class = c("tbl_df", 
"tbl", "data.frame"), row.names = c(NA, -21L), .Names = c("id", 
"var", "ymin", "ymax", "ymean"))

@Hardikgupta could you please clarify which I you mean in plot? — shiny, Oct 09 '17 at 05:22
See answer by [@computermacgyver](https://stackoverflow.com/a/18526649/680068) — zx8754, Oct 09 '17 at 07:28
@zx8754 Many thanks for your time and help. I tried the answer you suggested https://stackoverflow.com/a/18526649/5420677. However, it gave me upside down plot. Please, check the edit. — shiny, Oct 09 '17 at 07:54
So you don't want log transformation, but only want to display yaxis labels as `10^n`? — zx8754, Oct 09 '17 at 08:03
@zx8754 I need to show the levels that are close to zero through log transformation. In my case, in B and D variables, I want to show the "low" level which is so close to zero. Please, check the difference between plot2 and plot3 and how "low" level in B and D is not close to zero anymore in plot3 but the orientation changed. — shiny, Oct 09 '17 at 08:07

score 2 · Accepted Answer · answered Oct 10 '17 at 10:40

As @Miff has written, bars are generally not useful on a log scale. With bar plots, we compare the heights of the bars to one another. To do this, we need a fixed point from which to compare, usually 0, but log(0) is negative infinity.
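A quick check in base R illustrates the problem. The two values are the B means from the question's data. My understanding (an assumption, not something confirmed in this thread) is that geom_bar() starts its bars at 0 on the transformed scale, which corresponds to y = 1 on a log10 axis; that would be consistent with the downward-hanging bars in the question's second plot.

```r
# Means for id "B" (high and low), copied from the question's data
y <- c(high = 0.058076578, low = 0.00141152033333333)

# A bar that starts at zero has no finite base on a log axis
log10(0)   # -Inf

# Both values are below 1, so their log10 values are negative
log_y <- log10(y)
log_y
```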

So, I would strongly suggest that you consider using geom_point() instead of geom_bar(). I.e.,

library(ggplot2)
library(scales)  # provides trans_breaks(), trans_format(), math_format()

ggplot(df, aes(x=id, y=ymean , color=var)) +
  geom_point(position=position_dodge(.7))+
  scale_y_log10("y",
                breaks = trans_breaks("log10", function(x) 10^x),
                labels = trans_format("log10", math_format(10^.x)))+
  geom_errorbar(aes(ymin=ymin,ymax=ymax),
                size=.25,   
                width=.07,
                position=position_dodge(.7))+
  theme_bw()

If you really, really want bars, then you should use geom_rect instead of geom_bar and set your own baseline. That is, the baseline for geom_bar is zero but you will have to invent a new baseline in a log scale. Your Plot 1 seems to use 10^-7.

This can be accomplished with the following, but again, I consider this a really bad idea.

ggplot(df, aes(xmin=as.numeric(id)-.4, xmax=as.numeric(id)+.4, x=id,
               ymin=1e-7,  # invented baseline of 10^-7 (note that 10E-7 would be 10^-6)
               ymax=ymean, fill=var)) +
  geom_rect(position=position_dodge(.8))+
  scale_y_log10("y",
                breaks = trans_breaks("log10", function(x) 10^x),
                labels = trans_format("log10", math_format(10^.x)))+
  geom_errorbar(aes(ymin=ymin,ymax=ymax),
                size=.25,   
                width=.07,
                position=position_dodge(.8))+
  theme_bw()
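Any baseline for such bars is arbitrary. As a sketch only, one hypothetical rule is to round the smallest lower error bound down to the nearest power of ten, so that every bar and error bar stays above the baseline. The rule and the variable names below are illustrative assumptions, not part of ggplot2 or of the question.

```r
# The two smallest ymin values, copied from the question's data
ymin_vals <- c(0.001156667, 0.000412414)

# Hypothetical rule: round the smallest lower bound down to a power of ten
baseline <- 10^floor(log10(min(ymin_vals)))
baseline   # 1e-04
```

A different rule would give a different baseline and therefore different bar lengths, which is the reason bars are discouraged here.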

score 1 · Answer 2 · zx8754 · 2017-10-09T08:34:38.753

If you need bars flipped, maybe calculate your own log10(y), see example:

library(ggplot2)
library(dplyr)

# make your own log10
dfPlot <- df %>% 
  mutate(ymin = -log10(ymin),
         ymax = -log10(ymax),
         ymean = -log10(ymean))

# then plot
ggplot(dfPlot, aes(x = id, y = ymean, fill = var, group = var)) +
  geom_bar(position = "dodge", stat = "identity",
           width = 0.7,
           size = 0.9)+
  geom_errorbar(aes(ymin = ymin, ymax = ymax),
                size = 0.25,   
                width = 0.07,
                position = position_dodge(0.7)) +
  scale_y_continuous(name = expression(-log[10](italic(ymean)))) + 
  theme_bw()

edited Oct 09 '17 at 08:34

answered Oct 09 '17 at 08:28

zx8754


Many thanks for your time and help. Please check plot2 vs plot3 and your answer. Plot2 shows the values without any transformation, and the "low" level (the blue bar) is the lowest bar for A, B, C, D, E, and G. However, in plot3 and in your answer it became the highest bar. I wonder why? – shiny Oct 09 '17 at 08:44
@aelwan Because this is what `-log10` does. Try: `-log10(10000); -log10(0.0001)` – zx8754 Oct 09 '17 at 08:54
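The reversal described in this exchange can be checked in base R with the three means for id "G" from the question's data; the variable names are only illustrative.

```r
# Means for id "G", copied from the question's data
y <- c(high = 0.231733739333333,
       medium = 0.204891473333333,
       low = 0.140787890333333)

neg_log <- -log10(y)

# -log10 is a decreasing function, so the ordering of the values flips
names(which.max(y))        # "high"
names(which.max(neg_log))  # "low"
```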

score 1 · Answer 3 · answered Oct 09 '17 at 09:08

Firstly, don't do it! The help file from ?geom_bar says:

A bar chart uses height to represent a value, and so the base of the bar must always be shown to produce a valid visual comparison. Naomi Robbins has a nice article on this topic. This is why it doesn't make sense to use a log-scaled y axis with a bar chart.

To give a concrete example, the following is one way of producing the graph you want. The choice of k is arbitrary: a larger k is equally valid but produces a visually different plot.

k<- 10000  

ggplot(df, aes(x=id, y=ymean*k , fill=var, group=var)) +
  geom_bar(position="dodge", stat="identity",
           width = 0.7,
           size=.9)+
  geom_errorbar(aes(ymin=ymin*k,ymax=ymax*k),
                size=.25,   
                width=.07,
                position=position_dodge(.7))+
  theme_bw() + scale_y_log10(labels=function(x)x/k)
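A rough base R sketch of why k changes the picture, assuming that the bars start at 1 on the log10 axis (that is, at 0 after the transformation) so that the drawn length of a bar is log10(y * k). The helper names bar_length and ratio are hypothetical.

```r
# Means for id "B" (high and low), copied from the question's data
y <- c(high = 0.058076578, low = 0.00141152033333333)

# Drawn length of a bar on a log10 axis, measured from y * k = 1 (assumption)
bar_length <- function(y, k) log10(y * k)

# How many times longer the "high" bar looks than the "low" bar
ratio <- function(k) unname(bar_length(y["high"], k) / bar_length(y["low"], k))

ratio(1e4)  # apparent length ratio for k = 1e4
ratio(1e6)  # smaller ratio for k = 1e6, although the data are the same
```

The data ratio high/low is the same in both cases; only the apparent ratio of bar lengths changes with k.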

k=1e4

k=1e6
