I have a dataset that looks like this:

data

The first column has all unique values, while the third column is a factor with 3 levels. I want to create a bar chart like this: plot

I've been trying to do this in ggplot2, and so far I have:

ggplot(data, aes(as.factor(GO), pvalue, fill = Process)) + 
  geom_bar(position = "dodge", width = 0.5, stat = "identity")

But my plot currently looks like this: rplot

What can I do to get the bars ordered by group?

Thanks in advance!

Here is some sample data:

GO <- c("T cell chemotaxis (GO:0010818)", "T cell chemotaxis (GO:0010818)",
        "eosinophil chemotaxis (GO:0048245)", "CXCR3 chemokine receptor binding (GO:0048248)",
        "chemokine activity (GO:0008009)", "CXCR chemokine receptor binding (GO:0045236)",
        "cytoplasmic vesicle lumen (GO:0060205)", "vesicle lumen (GO:0031983)",
        "collagen-containing extracellular matrix (GO:0062023)")

pvalue <- c(2.49e-02, 3.07e-08, 1.90e-06, 2.22e-08, 1.72e-08, 1.63e-08,
            6.18e-03, 3.87e-08, 3.19e-07)
Process <- as.factor(rep(c("BP", "CP", "MP"), c(3, 3, 3)))
data <- data.frame(GO, pvalue, Process)
vdu12345
  • Does this answer your question? [Order Bars in ggplot2 bar graph](https://stackoverflow.com/questions/5208679/order-bars-in-ggplot2-bar-graph) – semaphorism Nov 02 '20 at 03:35
  • @semaphorism no, it doesn't. I tried it, but my bars are still out of order. I did: `theTable <- within(BP, GO <- factor(GO, levels=names(sort(table(GO), decreasing=TRUE))))` and then `ggplot(theTable, aes(x=GO, y = -log10(theTable$p.value..FDR.), fill=Process)) + geom_bar(stat = "identity")` – vdu12345 Nov 02 '20 at 03:48
  • Please post an example of your data in a useable format to enable us to help you troubleshoot. Alternatively, I recommend looking at http://wego.genomics.cn/ which will likely save you a lot of time/effort. – jared_mamrot Nov 02 '20 at 04:09
  • @jared_mamrot thanks for the advice! I've added a small sample of data. The person I need to hand this in to wants it plotted specifically in R – vdu12345 Nov 02 '20 at 04:35

1 Answer

Does this solve your problem?

library(ggplot2)

GO <- c("T cell chemotaxis (GO:0010818)", "T cell chemotaxis (GO:0010818)",
        "eosinophil chemotaxis (GO:0048245)", "CXCR3 chemokine receptor binding (GO:0048248)",
        "chemokine activity (GO:0008009)", "CXCR chemokine receptor binding (GO:0045236)",
        "cytoplasmic vesicle lumen (GO:0060205)", "vesicle lumen (GO:0031983)",
        "collagen-containing extracellular matrix (GO:0062023)")

pvalue <- c(2.49e-02, 3.07e-08, 1.90e-06, 2.22e-08, 1.72e-08, 1.63e-08,
            6.18e-03, 3.87e-08, 3.19e-07)
Process <- as.factor(rep(c("BP", "CP", "MP"), c(3, 3, 3)))
data <- data.frame(GO, pvalue, Process)

# Set the factor levels of GO so terms are sorted by Process first, then by name;
# ggplot2 draws a discrete axis in factor-level order
data$GO <- factor(data$GO, levels = unique(GO[order(Process, GO)]), ordered = TRUE)

# -log() is the natural log; use -log10(pvalue) if you want the base-10 scale
ggplot(data, aes(x = GO, y = -log(pvalue), fill = Process)) +
  geom_col(position = "dodge", width = 0.5) +
  theme_bw() +
  coord_flip()

example_image.png
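
To see why setting the levels matters, here is a minimal base-R sketch of the same reordering step on toy data. The vectors `term` and `group` and their values are made up for illustration and are not from the question. By default a factor's levels are sorted alphabetically, which interleaves the groups; building the levels from `order(group, term)` keeps each group's terms together, and that is the order a discrete ggplot2 axis follows.

```r
# Toy data (hypothetical values, for illustration only)
term  <- c("zeta", "alpha", "mid", "beta")
group <- factor(c("BP", "CP", "BP", "CP"))

# Default factor levels are alphabetical, so the groups end up interleaved
default_levels <- levels(factor(term))

# order() sorts by group first, then by term within each group
ord  <- order(group, term)
lvls <- unique(term[ord])

# Re-create the factor with the grouped level order
term_f <- factor(term, levels = lvls)

print(default_levels)
print(levels(term_f))
```

If you would rather sort each group by p-value than by name, the same pattern should work with the p-value column as the second argument to `order()`.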

jared_mamrot