I have a data frame that contains gene expression levels.

Column names are gene names and row names are patient IDs.

How can I make a bar plot where the x-axis shows gene names and the y-axis shows expression level?

I cannot find a way to label Gene A and Gene B in ggplot2 without merging their values into a single column and putting the labels Gene A and Gene B in a separate column.

There has to be a simple way to do this without changing the structure of data, but I cannot find it.

# Example data
df <- data.frame(1:5, 3:7)
colnames(df) <- c("Gene_A", "Gene_B")
row.names(df) <- c("Pat_A", "Pat_B", "Pat_C", "Pat_D", "Pat_E")

I tried the "one discrete, one continuous" approach that the ggplot2 cheat sheet suggests.

f <- ggplot(mpg, aes(class, hwy)) + geom_...()

In the code above, class would correspond to the gene names, but I cannot work out what to use for hwy.

Here is an example graph of what I want to get

AndrewGB
xin sun
  • Please see [How to create a Minimal, Reproducible Example](https://stackoverflow.com/help/minimal-reproducible-example) and [How to make a great R reproducible example](https://stackoverflow.com/questions/5963269/how-to-make-a-great-r-reproducible-example) and edit your question to include example 'input' data and expected output. – jared_mamrot Jan 19 '22 at 03:33

1 Answer

You can do something like this with the tidyverse, piping everything into ggplot so that the original data frame is not changed. First, I reshape the data to long format and summarise it (mean and standard deviation per gene), then pipe the summary data frame into ggplot to draw the two bars and their error bars.

library(tidyverse)

df %>% 
  pivot_longer(everything()) %>% 
  group_by(name) %>% 
  summarise(mean = mean(value),
            sd = sd(value)) %>% 
  ggplot(., aes(name, mean)) + 
  geom_col(fill=c("black", "grey"), colour="black") +  
  geom_errorbar(aes(ymin = mean - sd, ymax = mean + sd), width=0.2) +
  xlab("Gene") +
  ylab("Expression Level") +
  theme_classic()
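
To see what the pipeline hands to ggplot, it can help to stop before the plotting step and inspect the intermediate table. The following is a minimal sketch using the example data from the question; the name summary_df is only an illustrative choice and is not part of the original answer.

```r
library(tidyr)
library(dplyr)

# Example data from the question
df <- data.frame(1:5, 3:7)
colnames(df) <- c("Gene_A", "Gene_B")
row.names(df) <- c("Pat_A", "Pat_B", "Pat_C", "Pat_D", "Pat_E")

# pivot_longer() stacks all columns into two default columns:
# "name" (the former column name, i.e. the gene) and "value" (the expression level)
summary_df <- df %>%
  pivot_longer(everything()) %>%
  group_by(name) %>%
  summarise(mean = mean(value),
            sd = sd(value))

# One row per gene, with columns name, mean and sd
print(summary_df)

# The original data frame still has its wide layout
colnames(df)
```

Here name plays the role of class from the cheat-sheet example, and mean plays the role of hwy.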

Or, if you do not want to pivot at all, you can build each bar manually.

# Each layer draws one bar: x is a constant label, y is summarised to its mean.
# The fixed fill/colour settings replace the colour mapping, so no legend is drawn.
ggplot(df) + 
  geom_bar(aes(x = "Gene_A", y = Gene_A), stat = "summary", fun = "mean", fill = "black", colour = "black") + 
  geom_bar(aes(x = "Gene_B", y = Gene_B), stat = "summary", fun = "mean", fill = "grey", colour = "black") +
  xlab("Gene") +
  ylab("Expression Level") +
  theme_classic()
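
The manual version above draws only the bars. If error bars like those in the first approach are wanted, one possible extension is to add a stat_summary() layer per gene. This is a sketch rather than part of the original answer: the helpers lower_sd and upper_sd are hypothetical names introduced here, and it assumes a ggplot2 version (3.3.0 or later) in which stat_summary() accepts fun, fun.min and fun.max.

```r
library(ggplot2)

# Example data from the question
df <- data.frame(1:5, 3:7)
colnames(df) <- c("Gene_A", "Gene_B")
row.names(df) <- c("Pat_A", "Pat_B", "Pat_C", "Pat_D", "Pat_E")

# Hypothetical helpers: mean minus / plus one standard deviation
lower_sd <- function(v) mean(v) - sd(v)
upper_sd <- function(v) mean(v) + sd(v)

p <- ggplot(df) +
  # Bars: one layer per gene, summarised to the mean
  geom_bar(aes(x = "Gene_A", y = Gene_A), stat = "summary", fun = "mean",
           fill = "black", colour = "black") +
  geom_bar(aes(x = "Gene_B", y = Gene_B), stat = "summary", fun = "mean",
           fill = "grey", colour = "black") +
  # Error bars: one layer per gene, mean +/- one standard deviation
  stat_summary(aes(x = "Gene_A", y = Gene_A), geom = "errorbar",
               fun = mean, fun.min = lower_sd, fun.max = upper_sd, width = 0.2) +
  stat_summary(aes(x = "Gene_B", y = Gene_B), geom = "errorbar",
               fun = mean, fun.min = lower_sd, fun.max = upper_sd, width = 0.2) +
  xlab("Gene") +
  ylab("Expression Level") +
  theme_classic()

p
```

This needs two layers per gene, so for more than a handful of genes the pivoting approach is likely to be easier to maintain.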

AndrewGB