
I'm working with R and need to build a bar plot from an input file with two columns: the first contains the objects, and the second contains a numerical variable associated with each object. The bar plot must show the frequencies of the numerical variable, which can take the values 1 to 15, and 28. With the script below, I obtained the desired bar plot.

This is my initial input file, dataLig2:

Objects;N..of.similar..Glob.Sum...0.83..ligandable.pockets
1czz_A_001_______________;1
1d01_A_001_______________;3
1fbv_A_001_______________;5
1fbv_A_002_______________;1
1fbv_A_007_______________;2
1fs2_A_002_______________;1

This is the script:

library(ggplot2)

dataLig2 <- read.csv("C:/Users/tomma/ScoreDistribution.csv", sep = ";", header = TRUE, row.names = 1)

set.seed(1); dataLig2 <- data.frame(dataLig2)
dataLig2$Similar_ligandable_pockets <- factor(dataLig2$N..of.similar..Glob.Sum...0.83..ligandable.pockets, levels = 1:28, labels = c(1:15,"...","...","...","...","...","...","...","...","...","...","...","...",28))

dataLig2$group = ifelse(dataLig2$N..of.similar..Glob.Sum...0.83..ligandable.pockets == 1 , "1", ifelse(dataLig2$N..of.similar..Glob.Sum...0.83..ligandable.pockets == 2, "2", ifelse(dataLig2$N..of.similar..Glob.Sum...0.83..ligandable.pockets <=5, "3-5", ifelse(dataLig2$N..of.similar..Glob.Sum...0.83..ligandable.pockets <= 10, "6-10", "11-28"))))

myplot <- ggplot(dataLig2, aes(Similar_ligandable_pockets, fill = group)) + 
  geom_bar() +
  scale_x_discrete(drop = FALSE) +
  ggtitle("Ligandability prevista") +
  theme(plot.title = element_text(hjust = 0.5)) +
  #geom_histogram(color="black") +  
  scale_fill_manual(values = c("1" = "orange",
                               "2" = "yellow",
                               "3-5" = "olivedrab1",
                               "6-10" = "limegreen",
                               "11-28" = "green4"))

This is the result

[Image: the bar plot produced by the script above]

However, I would like to rearrange the legend so that the colors appear in the same sequence as in the chart, and I would like to add the frequency value above each bar. How can I do this? Thanks in advance for any suggestions!

  • Try: `values = ordered(c("1" = "orange", "2" = "yellow", "3-5" = "olivedrab1", "6-10" = "limegreen", "11-28" = "green4"))`. `ggplot2` does not have a method for ordering these. It uses the vector's own order, which is ANSI (or UTF-8) order for character vectors. E.g. for `c("1", "2", "11")` the order is `c("1", "11", "2")`, because 1 comes before 2 in both UTF-8 and ANSI characters. See [here](https://stackoverflow.com/a/26873982/10782538) for a duplicate question. – Oliver Mar 29 '21 at 17:37
  • Does this answer your question? [How to reorder a legend in ggplot2?](https://stackoverflow.com/questions/26872905/how-to-reorder-a-legend-in-ggplot2) – Oliver Mar 29 '21 at 17:38
  • Hmm, no. That way the script replaces the bar plot colors with different ones, and the legend isn't rearranged. – Tommaso Palomba Mar 29 '21 at 17:53
  • Could you please include some sample data (as a data object, e.g. `dput(dataLig2)` or `dput(head(dataLig2, 20))`), with enough data to make your question reproducible? https://stackoverflow.com/questions/5963269/how-to-make-a-great-r-reproducible-example – Peter Mar 29 '21 at 19:26
  • Ah, honestly I was being an idiot in my first comment. Don't `order` your values in `scale_fill_manual`. Order them in your data: `dataLig2$group <- ordered(dataLig2$group, levels = c("1", "2", "3-5", "6-10", "11-28"))`. That should fix your problem. – Oliver Mar 30 '21 at 06:56
  • Ah OK! Yes, it works this way. Thanks! And how do I add the frequency value above each bar? – Tommaso Palomba Mar 30 '21 at 07:54
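
For the remaining point (frequency labels), one possible approach is a `geom_text()` layer that reuses the count statistic. The sketch below is not taken from the thread: it runs on made-up toy data, uses a hypothetical short column name `n_pockets` in place of the long one from the file, and builds `group` with `cut()` instead of the nested `ifelse()` calls so that the factor levels (and therefore the legend) come out in the intended order. It assumes ggplot2 3.3.0 or later, where `after_stat()` is available.

```r
library(ggplot2)

# Toy data standing in for the real file; the values are made up for illustration.
# `n_pockets` is a hypothetical short name for the long column in the question.
dataLig2 <- data.frame(n_pockets = c(1, 1, 1, 2, 2, 3, 5, 5, 7, 12, 28))

# x axis: a factor with all 28 positions, so empty positions can still be shown
dataLig2$Similar_ligandable_pockets <- factor(dataLig2$n_pockets, levels = 1:28)

# Groups as a factor whose level order is the desired legend order.
# cut() uses right-closed intervals: (0,1], (1,2], (2,5], (5,10], (10,28]
group_levels <- c("1", "2", "3-5", "6-10", "11-28")
dataLig2$group <- cut(dataLig2$n_pockets,
                      breaks = c(0, 1, 2, 5, 10, 28),
                      labels = group_levels)

p <- ggplot(dataLig2, aes(Similar_ligandable_pockets, fill = group)) +
  geom_bar() +
  # Same count statistic as geom_bar(); vjust = -0.5 lifts each label above its bar
  geom_text(aes(label = after_stat(count)), stat = "count", vjust = -0.5) +
  scale_x_discrete(drop = FALSE) +
  ggtitle("Ligandability prevista") +
  theme(plot.title = element_text(hjust = 0.5)) +
  scale_fill_manual(values = c("1" = "orange",
                               "2" = "yellow",
                               "3-5" = "olivedrab1",
                               "6-10" = "limegreen",
                               "11-28" = "green4"))

print(p)
```

With the real data, keeping the original `group` column and applying the `ordered()` call from the comment above should give the same legend order; only the `geom_text()` layer would then need to be added to the existing plot.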
