I want to plot the results of a benchmark of several bioinformatics tools using ggplot. I would like to have all the bars on the same graph instead of one graph per tool. I already have an output made with LibreOffice (see image below), but I want to redo it with ggplot.

For now I have this kind of code for each tool (example with the first one):

library(ggplot2)

data_reduced <- read.table("benchmark_groups_4sps", sep="\t", header=TRUE)

# One bar per species count for a single tool (OrthoFinder); the x column
# is named Nb_species in the input data shown below.
p <- ggplot(data=data_reduced, aes(x=Nb_species, y=OrthoFinder)) +
    geom_bar(stat="identity", color="black", fill="red") +
    xlab("Number of species per group") + ylab("Number of groups") +
    geom_text(aes(label=OrthoFinder), vjust=1.6, color="black", size=3.5)

I have found out how to paste all the graphs together, but not how to merge them into a single one.

wanted output (from LibreOffice)

My input data :

Nb_species  OrthoFinder  FastOrtho  POGS (no_para)  POGS (soft_para)  proteinOrtho
4           125          142        152             202               114
5           61           65         42              79                44
6           37           29         15              21                8
7           19           17         4               7                 5
8           15           10         1               0                 0
9           10           2          0               0                 0

Thanks !

Micawber
  • I see you edited your question in the meantime to work with a different dataset. My solution was aimed at your original phrasing of the question. Do you think you can work with that? – Florian Jul 15 '17 at 13:37

1 Answer

Maybe this can help you in the right direction:

# sample data
df = data.frame(Orthofinder=c(1,2,3), FastOrtho=c(2,3,4),   POGs_no_para=c(1,2,2))

library(ggplot2)
library(reshape2)
library(dplyr)

# first let's convert the dataset: Convert to long format and aggregate.
df = melt(df, id.vars=NULL)
df = df %>% group_by(variable,value) %>% count()

# Then, we create a plot.
ggplot(df, aes(factor(value), n, fill = variable)) + 
  geom_bar(stat="identity", position = "dodge") + 
  scale_fill_brewer(palette = "Set1")

There is enough documentation around on formatting a plot, so I'll leave that to you ;) Hope this helps!

EDIT: Since the question was edited to use a different dataset while I was typing my answer, here is the code modified to work with it:

df = data.frame(Nb_species = c(4,5,6,7),  OrthoFinder=c(125,142,100,110), FastOrtho=c(100,120,130,140))

library(ggplot2)
library(reshape2)
library(dplyr)
df = melt(df, id.vars="Nb_species")

ggplot(df, aes(factor(Nb_species), value, fill = variable)) + 
  geom_bar(stat="identity", position = "dodge") + 
  scale_fill_brewer(palette = "Set1")
Florian
  • Yes, sorry, I changed it because I suddenly debugged the "geom_text" thing, but with the second dataset :). I'll give it a try. – Micawber Jul 15 '17 at 13:55
  • By the way, when you share your data using dput(), it is much easier to create a working example. Now it's very difficult for us to replicate your input data. – Florian Jul 15 '17 at 13:58
  • It works, thank you :). Only one problem: when I add geom_text(aes(label=value)), the values are stacked together and not at the top of their corresponding bar ... – Micawber Jul 15 '17 at 14:13
  • Great! For your new issue, maybe the answer can be found here: https://stackoverflow.com/questions/12018499/how-to-put-labels-over-geom-bar-for-each-bar-in-r-with-ggplot2 Could you please accept my answer if you found it helpful? Maybe others might stumble upon the same problem in the future. Thanks! – Florian Jul 15 '17 at 14:22
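
For the stacked-labels issue raised in the comments, one common approach is to give geom_text the same dodge position as the bars. The following is a minimal sketch, not the asker's final code: the values are copied from the table in the question, while the object names (bench, bench_long, dodge) and the column names tool and n_groups are made up for illustration. It uses geom_col, which is equivalent to geom_bar(stat="identity").

```r
library(ggplot2)
library(reshape2)

# Values copied from the table in the question. check.names = FALSE keeps
# column names containing spaces and parentheses unchanged.
bench <- data.frame(
  Nb_species = 4:9,
  OrthoFinder = c(125, 61, 37, 19, 15, 10),
  FastOrtho = c(142, 65, 29, 17, 10, 2),
  `POGS (no_para)` = c(152, 42, 15, 4, 1, 0),
  `POGS (soft_para)` = c(202, 79, 21, 7, 0, 0),
  proteinOrtho = c(114, 44, 8, 5, 0, 0),
  check.names = FALSE
)

# Wide to long: one row per (species count, tool) pair.
bench_long <- melt(bench, id.vars = "Nb_species",
                   variable.name = "tool", value.name = "n_groups")

# Use the same dodge for bars and labels so each label sits over its own bar.
dodge <- position_dodge(width = 0.9)

p <- ggplot(bench_long, aes(factor(Nb_species), n_groups, fill = tool)) +
  geom_col(position = dodge, color = "black") +
  geom_text(aes(label = n_groups), position = dodge, vjust = -0.4, size = 3) +
  scale_fill_brewer(palette = "Set1") +
  xlab("Number of species per group") + ylab("Number of groups")
p
```

The labels follow the bars because fill = tool is set in the top-level aes, so geom_text inherits the same grouping. As for sharing data, running dput(bench) prints a definition of the data frame that can be pasted straight into a question.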