I'm trying to create a stacked barchart with gene sequencing data, where for each gene there is a tRF.type and Amino.Acid value. An example data set looks like this:

tRF <- c('tRF-26-OB1690PQR3E', 'tRF-27-OB1690PQR3P', 'tRF-30-MIF91SS2P46I')
tRF.type <- c('5-tRF', 'i-tRF', '3-tRF')
Amino.Acid <- c('Ser', 'Lys', 'Ser')
tRF.data <- data.frame(tRF, tRF.type, Amino.Acid)

I would like the x-axis to represent the amino acid type, the y-axis to show the count of each tRF type, and the fill of the bars to represent each tRF type.

My code is:

ggplot(tRF.data, aes(x = Amino.Acid, y = tRF.type, fill = tRF.type)) + 
    geom_bar(stat="identity") + 
    ggtitle("LAN5 - 4 days post CNTF treatment") +
    xlab("Amino Acid") +
    ylab("tRF type")

However, it generates this graph, where the y-axis is labelled with the categories of tRF type. How can I change my code so that the y-axis scale is numerical and represents the counts of each tRF type?

Barchart

majaw99

2 Answers

Welcome to SO. In future questions, please be sure to provide a minimal reproducible example: code, an image (if possible), and at least a representative dataset that demonstrates your question or problem clearly.

TL;DR: don't use stat="identity". Just call geom_bar() without providing a stat, since the default, stat="count", counts the rows in each group, and drop the y aesthetic. This should work:

ggplot(tRF.data, aes(x = Amino.Acid, fill = tRF.type)) + geom_bar()

The dataset provided doesn't adequately demonstrate your issue, so here's one that can work. The example data herein consists of 100 observations and two columns: one called Capitals for randomly-selected uppercase letters and one Lowercase for randomly-selected lowercase letters.

library(ggplot2)
set.seed(1234)
df <- data.frame(
  Capitals=sample(LETTERS, 100, replace=TRUE),
  Lowercase=sample(letters, 100, replace=TRUE)
)

If I plot similar to your code, you can see the result:

ggplot(df, aes(x=Capitals, y=Lowercase, fill=Lowercase)) +
  geom_bar(stat="identity")

As you can see, the bars are stacked, but the y axis is all smooshed down. The reason comes down to the difference between geom_bar() and geom_col(). According to the documentation for these functions, the main difference is that geom_col() plots bars with heights equal to the y aesthetic, whereas geom_bar() by default uses stat="count" and plots the number of rows in each group. In fact, geom_bar(stat="identity") is really just a complicated way of saying geom_col().
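To illustrate what "heights equal to the y aesthetic" means, here is a minimal sketch in which the counts are computed by hand and then passed to geom_col(). The names counts and n are only illustrative choices, and the base R table() call is one of several ways to do the counting.

```r
library(ggplot2)

set.seed(1234)
df <- data.frame(
  Capitals  = sample(LETTERS, 100, replace = TRUE),
  Lowercase = sample(letters, 100, replace = TRUE)
)

# Count the rows for every Capitals/Lowercase combination.
# `n` is an illustrative column name, not something ggplot2 requires.
counts <- as.data.frame(
  table(Capitals = df$Capitals, Lowercase = df$Lowercase),
  responseName = "n"
)

# geom_col() draws each bar segment with a height equal to the numeric y value.
ggplot(counts, aes(x = Capitals, y = n, fill = Lowercase)) +
  geom_col()
```

Because n is numeric, the y axis here is a continuous count scale rather than a set of category labels.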

Since your y aesthetic is not numeric, ggplot still tries to treat the discrete levels numerically. It doesn't work out well, and it's the reason why your axis gets smooshed down like that. What you want is geom_bar(stat="count"), which is the same as calling geom_bar() without providing a stat=.

The one catch is that with stat="count", geom_bar() accepts an x or a y aesthetic, but not both, because the other axis is computed from the counts. This means you should only give it one of them. Dropping y fixes the issue, and now you get the proper chart:

ggplot(df, aes(x=Capitals, fill=Lowercase)) + geom_bar()
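If you want to check what geom_bar() computed, one option is ggplot2's layer_data(), which returns the data for a layer after the stat and position adjustment have been applied. This is only a sketch for inspection; the variable names p, bars and total are illustrative.

```r
library(ggplot2)

set.seed(1234)
df <- data.frame(
  Capitals  = sample(LETTERS, 100, replace = TRUE),
  Lowercase = sample(letters, 100, replace = TRUE)
)

p <- ggplot(df, aes(x = Capitals, fill = Lowercase)) +
  geom_bar()  # stat = "count" is the default

# Data for the first layer, after counting and stacking
bars <- layer_data(p, 1)

# `count` is the number of rows in each Capitals/Lowercase group;
# ymin and ymax give where each stacked segment starts and ends.
head(bars[, c("x", "count", "ymin", "ymax")])

# Every row of df is counted exactly once.
total <- sum(bars$count)
```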

chemdork123

You want your y-axis to be a count, not tRF.type. This code should give you the correct plot. I've removed y = tRF.type from ggplot() and stat = "identity" from geom_bar(), so it uses the default value of stat = "count" instead.

ggplot(tRF.data, aes(x = Amino.Acid, fill = tRF.type)) + 
     geom_bar() + 
     ggtitle("LAN5 - 4 days post CNTF treatment") +
     xlab("Amino Acid") +
     ylab("Count")
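
As a sanity check on what the y-axis should show, the same counts can be tabulated in base R from the example data in the question. This is only a sketch; counts and bar_heights are illustrative names.

```r
# Example data from the question
tRF.data <- data.frame(
  tRF        = c("tRF-26-OB1690PQR3E", "tRF-27-OB1690PQR3P", "tRF-30-MIF91SS2P46I"),
  tRF.type   = c("5-tRF", "i-tRF", "3-tRF"),
  Amino.Acid = c("Ser", "Lys", "Ser")
)

# Cross-tabulate amino acid (rows) by tRF type (columns).
# Each cell is the height of one coloured segment in the stacked bar.
counts <- table(tRF.data$Amino.Acid, tRF.data$tRF.type)
print(counts)

# Total height of each bar: Ser = 2, Lys = 1
bar_heights <- rowSums(counts)
```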
stlba