I'm working with R and I need to build a bar plot from an input file with two columns: the first contains the objects, and the second contains a numerical variable associated with each object. The bar plot must show the frequencies of the numerical variable. This variable can take values from 1 to 15, and 28. With the script below, I obtained the desired bar plot.
This is my initial input file, dataLig2:
Objects;N..of.similar..Glob.Sum...0.83..ligandable.pockets
1czz_A_001_______________;1
1d01_A_001_______________;3
1fbv_A_001_______________;5
1fbv_A_002_______________;1
1fbv_A_007_______________;2
1fs2_A_002_______________;1
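To reproduce this without my CSV file, the six sample rows above can be read inline. This is only a sketch: `sample_text` is a name made up for the example, and it relies on the `text` argument of `read.csv` instead of a file path.

```r
# Inline copy of the six sample rows shown above (sample_text is a hypothetical name)
sample_text <- "Objects;N..of.similar..Glob.Sum...0.83..ligandable.pockets
1czz_A_001_______________;1
1d01_A_001_______________;3
1fbv_A_001_______________;5
1fbv_A_002_______________;1
1fbv_A_007_______________;2
1fs2_A_002_______________;1"

# Same options as in my script, but reading from the string instead of the file
dataLig2 <- read.csv(text = sample_text, sep = ";", header = TRUE, row.names = 1)

# Quick check of the structure: one integer column, object names as row names
str(dataLig2)
```

With only these six rows most of the bars are empty, but the legend and the bar layout behave the same way as with the full file.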
This is the script:
library(ggplot2)
# Read the semicolon-separated file; the first column (Objects) becomes the row names
dataLig2 <- read.csv("C:/Users/tomma/ScoreDistribution.csv", sep=";", header=TRUE, row.names=1)
# Levels 16 to 27 never occur, so they all share the placeholder label "..."
dataLig2$Similar_ligandable_pockets <- factor(dataLig2$N..of.similar..Glob.Sum...0.83..ligandable.pockets, levels = 1:28, labels = c(1:15, rep("...", 12), 28))
# Assign each value to a colour group: 1, 2, 3-5, 6-10, 11-28
n <- dataLig2$N..of.similar..Glob.Sum...0.83..ligandable.pockets
dataLig2$group <- ifelse(n == 1, "1",
                  ifelse(n == 2, "2",
                  ifelse(n <= 5, "3-5",
                  ifelse(n <= 10, "6-10", "11-28"))))
myplot <- ggplot(dataLig2, aes(Similar_ligandable_pockets, fill = group)) +
  geom_bar() +
  scale_x_discrete(drop = FALSE) +
  ggtitle("Ligandability prevista") +
  theme(plot.title = element_text(hjust = 0.5)) +
  #geom_histogram(color="black") +
  scale_fill_manual(values = c("1" = "orange",
                               "2" = "yellow",
                               "3-5" = "olivedrab1",
                               "6-10" = "limegreen",
                               "11-28" = "green4"))
This is the result
However, I would like to rearrange the legend so that the colors appear in the same order as in the chart, and I would like to add the frequency value above each bar. How can I do this? Thanks in advance for any suggestions!