
This question is related to a previous post.

Say I have this data set, test:

   a b       c
1  a x      NA
2  b x 5.1e-03
3  c x 2.0e-01
4  d x 6.7e-05
5  e x      NA
6  f y 6.2e-05
7  g y 1.0e-02
8  h y 2.5e-03
9  i y 9.8e-02
10 j y 8.7e-04

> dput(test)
structure(list(a = structure(1:10, .Label = c("a", "b", "c", 
"d", "e", "f", "g", "h", "i", "j"), class = "factor"), b = structure(c(1L, 
1L, 1L, 1L, 1L, 2L, 2L, 2L, 2L, 2L), .Label = c("x", "y"), class = "factor"), 
c = c(NA, 0.0051, 0.2, 6.7e-05, NA, 6.2e-05, 0.01, 0.0025, 
0.098, 0.00087)), .Names = c("a", "b", "c"), row.names = c(NA, 
-10L), class = "data.frame")

Plotting it regularly with ggplot will give this graph:

ggplot of test

library(ggplot2)
ggplot(test, aes(fill=a, y=c, x=b)) + 
  geom_bar(position="dodge", stat="identity")

How do I set the y-axis to a log scale (e.g., 0, 10^-6, 10^-5, 10^-4, ..., 10^0) so that the bar heights won't be too far apart, without directly log-transforming the data? Also, how do I accomplish this in such a way that the NA values still show as zeroes in the graph?

I also tried the scale_y_log10() function, but then the bars go from top to bottom. I would like them not to be drawn that way.


Thank you!

mnm
Dodong

1 Answer


You can use geom_segment rather than geom_bar to draw each bar explicitly from 0 to its test$c value. A warning will still be issued, because scale_y_log10() transforms the y = 0 end of each segment to log10(0), which is -Inf.

Each level of test$a needs its own segment, hence aes(x=a, xend=a). facet_wrap then separates the x and y groups of test$b, taking the place of the dodged bars.

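library(ggplot2)

# One thick segment per level of a, drawn from 0 up to c; faceting by b replaces dodging.
# In ggplot2 >= 3.4.0, use linewidth=10 instead of size=10 for line geoms.
gg <- ggplot(test) + 
  geom_segment(aes(colour=a, y=0, yend=c, x=a, xend=a), size=10) +
  scale_y_log10() + facet_wrap(~b, scales="free_x") + 
  ylab("log10 value") + xlab("")
gg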
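
As a small illustration of where the warning comes from, and probably also of why the bars in the question hang from the top, here is what log10 does to a few values. The vector vals is made up for this sketch; only 6.2e-05 and 0.2 come from test$c.

```r
# Hypothetical values for illustration: a zero baseline, the smallest and
# largest values in test$c, and 1.
vals <- c(0, 6.2e-05, 0.2, 1)
logged <- log10(vals)
logged

# log10(0) is -Inf, so a segment end at y = 0 has no finite position on the axis.
is.infinite(logged[1])

# log10(1) is 0, and every value below 1 has a negative log. A bar anchored at 0
# in transformed space therefore starts at y = 1 and extends downward.
logged[4] == 0
all(logged[2:3] < 0)
```

All of the values in test$c are below 1, which would explain bars that run downward from the top of the panel under scale_y_log10().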

I am not a fan of replacing NA with 0: a missing value is not 0. I would rather just label the missing values as NA.

test$c_label <- test$c
test$c_label[is.na(test$c)] <- "NA"

gg + geom_label(data=subset(test, is.na(c)), aes(x=a, y=0.00001, label=c_label), size=5)

While this might be a workaround, I completely agree with @dww's comment - "You should not use a log scale with bar plots. The base at log(0) is impossible to plot. Choice of a different base value is arbitrary and can be used to make the bars look as similar or as different as you wish depending on the value chosen. This is a form of misleading graph. Use a dot plot, or something else instead if you really need log scale."
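
For completeness, a minimal sketch of the dot plot suggested in that comment might look like the following. It is one possible reading of the suggestion rather than code from @dww. The data frame is rebuilt from the dput() output in the question, and the point size and axis label are arbitrary choices.

```r
library(ggplot2)

# Same data as in the question, rebuilt from the dput() output.
test <- data.frame(
  a = factor(letters[1:10]),
  b = factor(rep(c("x", "y"), each = 5)),
  c = c(NA, 0.0051, 0.2, 6.7e-05, NA, 6.2e-05, 0.01, 0.0025, 0.098, 0.00087)
)

# Points encode position only, so no baseline at zero is needed on the log scale.
p <- ggplot(test, aes(x = a, y = c, colour = a)) +
  geom_point(size = 4, na.rm = TRUE) +  # na.rm = TRUE drops the NA rows without a warning
  scale_y_log10() +
  facet_wrap(~b, scales = "free_x") +
  ylab("c (log10 scale)") + xlab("")
p
```

The two NA rows are simply not drawn here. They could still be marked with geom_label in the same way as above.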


Djork