I'm trying to create a bar plot with a log10 y-axis for data that includes nonzero values less than 1, like the image below (created in GraphPad), based on the following data (dataframe = df):

No  Perc    Codon
1   72   ATA
2   12.7273  ATG
3   8.72727  GTG
4   0.727273     ATG
5   0.363636     GTG
6   0.363636     GTG
7   0.363636     GTG
8   0.363636     GTG
9   0.363636     GTG
10  0.363636     ATA
11  0.363636     ATA
12  0.363636     ATA
13  0.363636     ATA
14  0.363636     ATA
15  0.363636     ATA
16  0.363636     ATG
17  0.363636     ATG
18  0.363636     ATG
19  0.363636     ATG
20  0.363636     ATG

graphpad image

Using ggplot on the data above, I tried to create something similar, drawing on answers to other questions, but to no avail:

ggplot(data = df, aes(x = No, y = Perc, color = Codon)) +
  geom_bar(stat = "identity") +
  scale_y_log10(breaks = c(0.1, 1, 10, 100),
                limits = c(0.1, 100),
                position = "top") +
  coord_flip() +
  geom_segment(aes(x = No, xend = No, y = 0.01, yend = Perc))

which produces the following image:

ggplot image

However, I would like the bar to continue in the same direction despite being less than 1. Moreover, is there a way that I can reposition the x-axis to the top of the graph and reorder the data such that the largest values are on the top as in the graphpad generated image?

I know that other, similar questions have described the same issue with the direction of the bars, but without a solution.

Any help or advice for this novice would be much appreciated. Thanks!

Matt

2 Answers

Let me provide another answer, using library(scales).

Your commands were not in the right order, and a true log10 transformation will not work here: log10 of a value below 1 is negative, so those bars are drawn in the opposite direction. With library(scales) you can use a pseudo_log transformation instead. Here is my solution:

library(ggplot2)
library(scales)

# Row 4 is 0.727273 in the question's data, so it is listed explicitly
df = data.frame("No" = seq_len(20),
                "Perc" = c(72.0, 12.7273, 8.72727, 0.727273,
                           rep(0.363636, 16)),
                "Codon" = as.factor(c("ATA","ATG","GTG", "ATG",
                                      rep("GTG",5), rep("ATA",6),
                                      rep("ATG",5))))

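# Tick positions in log-style steps: 0-1 by 0.1, 1-10 by 1, 10-100 by 10
breaks <- unique(c(seq(0, 1, by = 0.1),
                   seq(1, 10, 1), seq(10, 100, 10)))
# Label only 0.1, 1, 10 and 100; the other ticks stay blank
labs <- c("", "0.1", rep("", 8), "1",
          rep("", 8), "10", rep("", 8), "100")

gg <- ggplot(data = df, aes(x = No, y = Perc, fill = Codon))
gg + geom_bar(stat = "identity") +
    coord_flip(ylim = c(0.1, 100)) +  # zoom without dropping any bars
    scale_x_reverse() +               # No = 1 (largest value) at the top
    scale_y_continuous(
        trans = pseudo_log_trans(base = 10),
        breaks = breaks, position = "right",
        labels = labs)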

Codon barplot with inverted scale
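
To illustrate why a plain log10 scale flips the small bars while the pseudo-log scale does not, here is a minimal base-R sketch. It assumes that pseudo_log_trans() with its default sigma = 1 computes asinh(x / (2 * sigma)) / log(base), which is my reading of the scales source rather than something stated in this thread; the helper name pseudo_log10 is made up for the example.

```r
# A few values from the question, two of them below 1
perc <- c(72, 12.7273, 0.727273, 0.363636)

# Plain log10: values below 1 map to negative numbers, so bars anchored at
# log10(1) = 0 grow in the opposite direction
log10(perc)

# Hypothetical helper mirroring (as an assumption) what
# scales::pseudo_log_trans(sigma = 1, base = 10) does
pseudo_log10 <- function(x, sigma = 1) {
  asinh(x / (2 * sigma)) / log(10)
}

# Pseudo-log: every positive value stays positive, so all bars share a direction
pseudo_log10(perc)
```

One trade-off worth knowing: the pseudo-log scale is close to linear near zero, so the spacing between 0.1 and 1 is tighter than it would be on a true log axis.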

Hope it helps.

Oriol

  • Thanks Oriol, this really helped and is exactly what I'm looking for. And thanks for the teaching points as well. – Matt Dec 12 '19 at 22:45
  • Actually, I was wondering if there was a way that I could format the y-axis limits as well as the log tick annotation. I've tried to use the argument `limits = c(0.1, 100)` within the `scale_y_continuous` function, but it looks like geom_bar doesn't like that and it results in the warning: `Removed 20 rows containing missing values (geom_bar)`. Moreover, annotation of log ticks fails with the function `annotation_logticks()` and results in the error: `Error in unit(xticks$x, "native") : 'x' and 'units' must have length > 0`. Any thoughts on this...? Thanks again. – Matt Dec 15 '19 at 17:06
  • I have edited the answer to show what you were asking. To my knowledge, `coord_flip` and `annotation_logticks()` are not compatible (https://stackoverflow.com/questions/20460226/annotation-logticks-and-coord-flip-seem-incompatible). You can change the `ylim` inside the `coord_flip` function, as I have shown in the edit. As for the log ticks, my approach would be to set the log breaks directly inside the scale function and label only the more important numbers. Then you could edit the axis format with `theme` for a better look. – Oriol Dec 16 '19 at 21:50
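
The "set the log breaks directly and label only the important numbers" idea from the last comment can also be generated rather than typed out by hand. The following is only a sketch in base R; the variable names (exponents, is_decade) are illustrative, and it leaves out the 0 break from the answer because the visible range starts at 0.1.

```r
# Decades to cover: 0.1-0.9, 1-9, 10-90, plus the closing 100
exponents <- -1:1
breaks <- sort(unique(c(as.vector(outer(1:9, 10^exponents)), 100)))

# Label only exact powers of ten; the tolerance guards against
# floating-point noise in log10()
is_decade <- abs(log10(breaks) - round(log10(breaks))) < 1e-9
labs <- ifelse(is_decade, as.character(breaks), "")
```

These two vectors would then be passed as `breaks = breaks, labels = labs` to `scale_y_continuous()`, in the same way as the hand-written ones in the answer.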

geom_bar doesn't play nicely with scale_y_log10: bars are anchored at 0 on the transformed scale, which is 1 in data units, so values below 1 are drawn in the opposite direction. I reckon the best way forward is to use geom_segment, as you're already attempting. Note that you're asking geom_segment to start from 0.01, but your y-axis only starts at 0.1. Tweaking the aes gets you something to work from. How's this:

library(ggplot2)

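ggplot(data = df) +
  geom_segment(aes(x = No, 
                   xend = No, 
                   y = 0.1,        # start at the lower axis limit
                   yend = Perc, 
                   colour = Codon),
               size = 3) +         # ggplot2 >= 3.4.0 prefers linewidth = 3
  coord_flip() +
  scale_y_log10(name = "Perc",
                breaks = c(0.1, 1, 10, 100), 
                limits = c(0.1, 100), 
                # y scales take "left"/"right"; after coord_flip(),
                # "right" is the one drawn along the top
                position = "right")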
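
The question also asked for the largest values at the top and the value axis along the top edge. A possible sketch for the segment version follows; it assumes the rows are already sorted by Perc in descending order (as they are in the question's data), so reversing the No axis is enough. The data frame is rebuilt inline only so that the block runs on its own, and `linewidth` assumes ggplot2 3.4.0 or later (older versions use `size`).

```r
library(ggplot2)

# Same values as the question's data, built inline so the block is self-contained
df <- data.frame(
  No    = 1:20,
  Perc  = c(72, 12.7273, 8.72727, 0.727273, rep(0.363636, 16)),
  Codon = c("ATA", "ATG", "GTG", "ATG",
            rep("GTG", 5), rep("ATA", 6), rep("ATG", 5))
)

p <- ggplot(df) +
  geom_segment(aes(x = No, xend = No, y = 0.1, yend = Perc, colour = Codon),
               linewidth = 3) +  # use size = 3 on ggplot2 < 3.4.0
  scale_x_reverse() +            # No = 1 (largest Perc) ends up at the top
  scale_y_log10(name = "Perc", breaks = c(0.1, 1, 10, 100),
                limits = c(0.1, 100), position = "right") +
  coord_flip()                   # after flipping, "right" is drawn along the top

p
```

If the data were not pre-sorted, the rows would need to be ordered by Perc first; that step is not shown here.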

I'll include the data in an easy to use way here, if anyone else wants a go:


txt <- "No  Perc    Codon
1   72   ATA
2   12.7273  ATG
3   8.72727  GTG
4   0.727273     ATG
5   0.363636     GTG
6   0.363636     GTG
7   0.363636     GTG
8   0.363636     GTG
9   0.363636     GTG
10  0.363636     ATA
11  0.363636     ATA
12  0.363636     ATA
13  0.363636     ATA
14  0.363636     ATA
15  0.363636     ATA
16  0.363636     ATG
17  0.363636     ATG
18  0.363636     ATG
19  0.363636     ATG
20  0.363636     ATG"

df <- read.table(text = txt, header = TRUE)

Created on 2019-12-12 by the reprex package (v0.2.1)

MSR