I have a (biological) data frame of gene abundances and the metabolic processes they contribute to.
> head(as.data.frame(df))
Total_abundance process1 process10 process11 process12 process13
1 53132920 Glycolysis / Gluconeogenesis 0 0 0 0
2 35708645 Pyrimidine metabolism 0 0 0 0
3 33620967 Arginine biosynthesis 0 0 0 0
4 26119946 Fatty acid degradation 0 0 0 0
5 26119946 Fatty acid degradation 0 0 0 0
6 20600274 Fatty acid degradation 0 0 0 0
process2 process3 process4 process5
1 Pyruvate metabolism Propanoate metabolism Metabolic pathways Carbon metabolism
2 Selenocompound metabolism 0 0 0
3 Alanine, aspartate and glutamate metabolism Nitrogen metabolism Metabolic pathways 0
4 Butanoate metabolism Metabolic pathways Carbon metabolism Fatty acid metabolism
5 Butanoate metabolism Metabolic pathways Carbon metabolism Fatty acid metabolism
6 Valine, leucine and isoleucine degradation alpha-Linolenic acid metabolism Metabolic pathways Fatty acid metabolism
process6 process7 process8 process9
1 0 0 0 0
2 0 0 0 0
3 0 0 0 0
4 0 0 0 0
5 0 0 0 0
6 0 0 0 0
In this data frame, some of the genes unfortunately contribute to more than one metabolic process (if a gene contributes to only one process, the remaining processX columns contain 0).
Currently, I am plotting only the process1 column, but I would like to include the other processes as well. This is how I am currently plotting the data:
library(dplyr)    # provides %>%
library(ggplot2)

df %>%
  ggplot(aes(x = process1, y = Total_abundance, fill = process1)) +
  geom_bar(stat = "identity")
But this only uses process1 and ignores all the other columns. How can I include the other columns (where they are not 0)? I thought of reshaping the data frame, but I am not sure how to do this.
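To make the reshaping idea concrete, here is a rough sketch of what a wide-to-long conversion might look like. It assumes the tidyr and dplyr packages are available and that the "empty" entries are either the number 0 or the string "0". The toy values and the names `df_long`, `slot`, and `process` are made up for illustration and are not part of the real data.

```r
library(dplyr)
library(tidyr)
library(ggplot2)

# Toy data mimicking the structure above (illustrative values only)
df <- data.frame(
  Total_abundance = c(100, 50, 30),
  process1 = c("Glycolysis / Gluconeogenesis", "Pyrimidine metabolism",
               "Fatty acid degradation"),
  process2 = c("Pyruvate metabolism", "Selenocompound metabolism",
               "Butanoate metabolism"),
  process3 = c("Propanoate metabolism", "0", "Metabolic pathways"),
  process4 = c(0, 0, 0),  # an all-zero column may be numeric rather than character
  stringsAsFactors = FALSE
)

df_long <- df %>%
  # Make every processX column character so they can be stacked together
  mutate(across(starts_with("process"), as.character)) %>%
  # One row per (gene, process slot) pair
  pivot_longer(
    cols = starts_with("process"),
    names_to = "slot",
    values_to = "process"
  ) %>%
  # Drop the placeholder entries
  filter(process != "0")

# Same plot as before, but now over all processes
p <- ggplot(df_long, aes(x = process, y = Total_abundance, fill = process)) +
  geom_col() +
  coord_flip()
```

Calling `print(p)` would draw the plot. Note that with this layout a gene's full abundance is counted once for every process it contributes to, so the bars no longer sum to the overall total; whether that is the desired behaviour depends on the analysis.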
Thank you. :-)