
I am using R for the first time. I'm trying to use ggplot2 to make an arrow genome map and I am having trouble with a few of the details. Here is a snippet of the data:

genome start end gene colour
A 11638 12786 fadA6 #04E762
A 12798 13454 fadE31 #04E762
A 13529 14341 fadE32 #04E762
A 14342 15541 fadE33 #FB5607
A 15627 17168 cyp142 #FB5607

And the code I currently have:

library(ggplot2)
library(gggenes)
ggplot(REXAMPLE3, aes(xmin = start, xmax = end, y = genome, fill = gene)) +
  geom_gene_arrow() +
  geom_gene_label(aes(label = gene)) +
  facet_wrap(~ genome, scales = "free", ncol = 1) +
  scale_fill_brewer(palette = "Set3") +
  theme(legend.position = "none")

This is the output of the code

  • I need to colour the genes by function (e.g. green for side chain degradation) but I only know how to colour them individually with gene1="colour". Is there a way to define the colour of each gene with a column in the dataset?

  • Is there a way to make the labels go above the gene instead of within the arrow?

  • Which aes do I change to make the genes look less 'cramped'?

Thank you in advance for any help. It's hard to know what to google when you're completely new to the program.

neilfws
Lauren
  • Welcome to SO. If you could provide some example data, that would help us come up with a solution faster. https://stackoverflow.com/questions/5963269/how-to-make-a-great-r-reproducible-example – emilliman5 Jan 28 '20 at 23:07
  • Thank you for the tip - I have added a snippet of the data – Lauren Jan 28 '20 at 23:21

1 Answer

1) To colour your genes with the colours you have set in the "colour" column, map that column to the fill argument of aes and then add scale_fill_identity(), which uses the values in the column as the colours themselves.
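
As a minimal sketch of what scale_fill_identity() does, the example below rebuilds the question's data snippet as a data frame (named df here, as in the code further down) and draws plain rectangles with geom_rect() so that only ggplot2 is needed. The ymin/ymax values and the built_fill variable are illustrative assumptions; the same fill = colour mapping is what gets passed to geom_gene_arrow().

```r
library(ggplot2)

# Data copied from the snippet in the question
df <- data.frame(
  genome = "A",
  start  = c(11638, 12798, 13529, 14342, 15627),
  end    = c(12786, 13454, 14341, 15541, 17168),
  gene   = c("fadA6", "fadE31", "fadE32", "fadE33", "cyp142"),
  colour = c("#04E762", "#04E762", "#04E762", "#FB5607", "#FB5607"),
  stringsAsFactors = FALSE
)

# fill = colour maps the column to the fill aesthetic;
# scale_fill_identity() makes ggplot2 use the hex codes as they are
# instead of assigning palette colours to them.
p <- ggplot(df, aes(xmin = start, xmax = end, ymin = 0, ymax = 1, fill = colour)) +
  geom_rect() +
  scale_fill_identity()

# Fill values ggplot2 will actually draw with (one per gene)
built_fill <- layer_data(p)$fill
```

Without scale_fill_identity(), ggplot2 would treat the hex codes as category labels and pick its own colours for them.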

2) To place the gene labels above the arrows, you can set y to 1.05 in geom_gene_label. However, I noticed that the label size is then not consistent (cyp142 is smaller than the others). So, instead, you can replace geom_gene_label with geom_text and calculate the x position of each label so that it sits in the middle of its corresponding arrow.
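
The midpoint calculation can be checked on its own in base R. This is a small sketch using the start and end values from the question; the data frame name genes and the column name mid are made up for illustration.

```r
# Start/end positions and gene names from the question's snippet
genes <- data.frame(
  start = c(11638, 12798, 13529, 14342, 15627),
  end   = c(12786, 13454, 14341, 15541, 17168),
  gene  = c("fadA6", "fadE31", "fadE32", "fadE33", "cyp142")
)

# Midpoint of each gene: the x position where its label should be centred.
# This is the same value as end - (end - start) / 2.
genes$mid <- (genes$start + genes$end) / 2

genes$mid[1]  # 12212 for fadA6
```

Computing the midpoint inside aes(), or precomputing it as a column and mapping x = mid, should give the same label positions.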

Altogether, you can write something like this:

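library(ggplot2)
library(gggenes)
# fill = colour takes the hex codes from the colour column
ggplot(df, aes(xmin = start, xmax = end, y = genome, fill = colour)) +
  geom_gene_arrow() +
  # label at the midpoint of each arrow, slightly above it (y = 1.1)
  geom_text(aes(x = end - ((end - start) / 2), y = 1.1, label = gene)) +
  facet_wrap(~ genome, scales = "free", ncol = 1) +
  theme(legend.position = "none") +
  # use the values of the colour column as the actual fill colours
  scale_fill_identity() +
  xlab("")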

Does this answer your question?

dc37